[HN Gopher] PostgreSQL 17
___________________________________________________________________
PostgreSQL 17
Author : jkatz05
Score : 519 points
Date : 2024-09-26 13:10 UTC (9 hours ago)
(HTM) web link (www.postgresql.org)
(TXT) w3m dump (www.postgresql.org)
| kiwicopple wrote:
| Another amazing release, congrats to all the contributors. There
| are simply too many things to call out - just a few highlights:
|
| Massive improvements to vacuum operations:
| "PostgreSQL 17 introduces a new internal memory structure for
| vacuum that consumes up to 20x less memory."
|
| Much needed features for backups:
| "pg_basebackup, the backup utility included in PostgreSQL, now
| supports incremental backups and adds the pg_combinebackup
| utility to reconstruct a full backup"
|
| I'm a huge fan of FDWs and think they are an untapped gem in
| Postgres, so I love seeing these improvements:
| "The PostgreSQL foreign data wrapper (postgres_fdw), used to
| execute queries on remote PostgreSQL instances, can now push
| EXISTS and IN subqueries to the remote server for more efficient
| processing."
| ellisv wrote:
| > I'm a huge fan of FDWs
|
| Do you have any recommendations on how to manage credentials
| for `CREATE USER MAPPING` within the context of cloud-hosted
| DBs?
| darth_avocado wrote:
| If your company doesn't have an internal tool for storing
| credentials, you can always store them in the cloud
| provider's secrets management tool. E.g. Secrets Manager or
| Secure String in Parameter Store on AWS. Your CI/CD pipeline
| can pull the secrets from there.
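|
| For example, the CI step can inject the secret as a psql
| variable; a minimal sketch (the server, role, and
| fetch-secret command here are all illustrative):
|
|     -- psql -v fdw_password="$(fetch-secret ...)" -f map.sql
|     CREATE USER MAPPING IF NOT EXISTS FOR app_user
|         SERVER remote_pg
|         OPTIONS (user 'fdw_user', password :'fdw_password');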
| kiwicopple wrote:
| in supabase we have a "vault" utility for this (for example:
| https://fdw.dev/catalog/clickhouse/#connecting-to-
| clickhouse). Sorry I can't make recommendations for other
| platforms because i don't want to suggest anything that could
| be considered unsafe - hopefully others can chime in
| brunoqc wrote:
| I batch import XMLs, CSVs and mssql data into postgresql.
|
| I'm pretty sure I could read them when needed with fdw. Is it a
| good idea?
|
| I think it can be slow but maybe I could use materialized views
| or something.
| kiwicopple wrote:
| "it depends". Some considerations for mssql:
|
| - If the foreign server is close (latency) that's great
|
| - if your query is complex then it helps if the postgres
| planner can "push down" to mssql. That will usually happen if
| you aren't doing joins to local data
|
| I personally like to set up the foreign tables, then
| materialize the data into a local postgres table using
| pg_cron. It's like a basic ETL pipeline completely built into
| postgres
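|
| A rough sketch of that setup (tds_fdw for mssql; all names,
| the column list, and the schedule are illustrative):
|
|     CREATE FOREIGN TABLE remote_orders (
|         id int,
|         total numeric,
|         updated_at timestamptz
|     ) SERVER mssql_srv OPTIONS (table_name 'dbo.orders');
|
|     -- upsert into a local copy every 15 minutes
|     SELECT cron.schedule('sync-orders', '*/15 * * * *',
|       $$INSERT INTO local_orders
|         SELECT * FROM remote_orders
|         ON CONFLICT (id) DO UPDATE SET
|           total = EXCLUDED.total,
|           updated_at = EXCLUDED.updated_at$$);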
| mind-blight wrote:
| I've been using duckdb to import data into postgres
| (especially CSVs and JSON) and it has been really effective.
|
| Duckdb can run SQL across the different data formats and
| insert or update directly into postgres. I run duckdb with
| python and Prefect for batch jobs, but you can use whatever
| language or scheduler you prefer.
|
| I can't recommend this setup enough. The only weird thing
| I've run into: a really complex join across multiple
| postgres tables and parquet files hit a bug reading a
| postgres column type. I simplified the query (which was a
| good idea anyway) and it hums away.
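|
| The core of it is tiny. A sketch using DuckDB's postgres
| extension (connection string and table names made up):
|
|     INSTALL postgres; LOAD postgres;
|     ATTACH 'dbname=app host=localhost' AS pg (TYPE postgres);
|     INSERT INTO pg.public.staging_events
|     SELECT * FROM read_csv_auto('events.csv');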
| brunoqc wrote:
| Thanks. My current pattern is to parse the files with rust,
| copy from stdin into a psql temp table, update the rows
| that have changed and delete the rows not existing anymore.
|
| I'm hoping it's less wasteful than truncating and importing
| the whole table every time there is one single change.
| veggieroll wrote:
| Loving the continued push for JSON features. I'm going to get a
| lot of use out of JSON_TABLE. And json_scalar & json_serialize
| are going to be helpful at times too. JSON_QUERY with OMIT QUOTES
| is awesome too for some things.
|
| I hope SQLite3 can implement SQL/JSON soon too. I have a
| library of compatibility functions to generate the
| appropriate JSON operations depending on whether it's SQLite3
| or PostgreSQL. And it'd be nice to reduce the number of
| incompatibilities over time.
|
| But, there's a ton of stuff in the release notes that jumped out
| at me too:
|
| "COPY .. ON_ERROR" ignore is going to be nice for loading data
| anywhere that you don't care if you get all of it. Like a dev
| environment or for just exploring something. [1]
|
| Improvements to CTE plans are always welcome. [2]
|
| "transaction_timeout" is an amazing addition to the existing
| "statement_timeout" as someone who has to keep an eye on less
| experienced people running SQL for analytics / intelligence. [3]
|
| There's a function to get the timestamp out of a UUID easily now,
| too: uuid_extract_timestamp(). This previously required a user
| defined function. So it's another streamlining thing that's nice.
| [4]
|
| I'll use the new "--exclude-extension" option for pg_dump, too. I
| just got bitten by that when moving a database. [5]
|
| "Allow unaccent character translation rules to contain whitespace
| and quotes". Wow. I needed this! [6]
|
| [1]
| https://www.postgresql.org/docs/17/release-17.html#RELEASE-1...
|
| [2]
| https://www.postgresql.org/docs/17/release-17.html#RELEASE-1...
|
| [3]
| https://www.postgresql.org/docs/17/release-17.html#RELEASE-1...
|
| [4]
| https://www.postgresql.org/docs/17/release-17.html#RELEASE-1...
|
| [5]
| https://www.postgresql.org/docs/17/release-17.html#RELEASE-1...
|
| [6]
| https://www.postgresql.org/docs/17/release-17.html#RELEASE-1...
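|
| A quick taste of a few of those (paths and values made up):
|
|     -- load the rows that parse, skip the bad ones
|     COPY staging FROM '/tmp/dump.csv'
|         (FORMAT csv, ON_ERROR ignore);
|
|     -- cap the whole transaction, not just one statement
|     SET transaction_timeout = '5min';
|
|     -- timestamp straight out of a v1/v7 UUID column
|     SELECT uuid_extract_timestamp(id) FROM events LIMIT 1;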
| nbbaier wrote:
| > I hope SQLite3 can implement SQL/JSON soon too. I have a
| library of compatibility functions to generate the
| appropriate JSON operations depending on whether it's
| SQLite3 or PostgreSQL. And it'd be nice to reduce the number
| of incompatibilities over time.
|
| Is this available anywhere? Super interested
| veggieroll wrote:
| No, it's not at the moment. Sorry!
|
| The most useful part is doing set intersection operations on
| JSON arrays. The second is probably extracting a value by
| path across both.
|
| It's not crazy to implement; SQLite was the harder side. Just
| a bit of fiddling with `json_each`, EXISTS, and aggregate
| functions.
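|
| e.g. the SQLite side of an array intersection is roughly:
|
|     SELECT value FROM json_each('[1,2,3]')
|     INTERSECT
|     SELECT value FROM json_each('[2,3,4]');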
| nbbaier wrote:
| Might give it a whirl. Is the library pure sql or is it
| written in something else?
| veggieroll wrote:
| I've been using Go with Goqu [1] for SQL for a lot of
| things lately. But, the language shouldn't matter much.
| The functions are just building the appropriate SQL
| clauses.
|
| For some of the operations, the method I was using
| required marshaling the inputs to JSON before sending
| them over the wire. And that's nicer in a non-SQL
| programming language. But both DBs ultimately do have
| json_build_array/json_build_object for PostgreSQL and
| json_array/json_object for SQLite3.
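|
| i.e. the same literal on each side:
|
|     SELECT json_build_array(1, 'a');  -- PostgreSQL
|     SELECT json_array(1, 'a');        -- SQLite3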
|
| [1] https://github.com/doug-martin/goqu
| pbronez wrote:
| Yeah JSON_TABLE looks pretty cool. Here are the details:
|
| https://www.postgresql.org/docs/17/functions-json.html#FUNCT...
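|
| A minimal example of the shape (data made up):
|
|     SELECT t.* FROM JSON_TABLE(
|         '[{"sku": "a1", "qty": 2}]'::jsonb,
|         '$[*]' COLUMNS (
|             sku text PATH '$.sku',
|             qty int PATH '$.qty')
|     ) AS t;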
| on_the_train wrote:
| My boss insisted on the switch from oracle to mssql. Because "you
| can't trust open source for business software". Oh the pain
| gigatexal wrote:
| So from one expensive vendor to another? Your boss seems smart.
| ;-)
|
| What's the rationale? What do you gain?
| on_the_train wrote:
| Exactly. Supposedly the paid solution ensures long-term
| support. The most fun part is that our customers need to buy
| these database licenses, so it directly reduces our own pay.
| Say no to non-technical (or irrational) managers :<
| remram wrote:
| Did you not pay for Oracle?
| on_the_train wrote:
| We pay, but what hurts is that our customers need to pay,
| too. For both oracle and ms of course
| systems wrote:
| Well, from one VERY expensive vendor to another considerably
| less expensive vendor.
|
| Also, MSSQL has a few things going for it, and surprisingly
| no one seems to be even trying to catch up:
|
| - Their BI stack (PowerBI, SSAS)
|
| - Their database development tooling, SQL Database Projects
| ( https://learn.microsoft.com/en-
| us/sql/tools/sql-database-projects/sql-database-
| projects?view=sql-server-ver16 )
|
| The MSSQL BI stack is unmatched: SSAS is the top star of BI
| cubes and the second option is not even close.
|
| SSRS is ok and SSIS is passable, but both are still very
| decent.
|
| PowerBI and family is also the best option for mid to large
| (not FAANG large, just normal large) companies.
|
| And finally, the GEM that is database projects: you can
| program your DB changes declaratively. There is nothing like
| this in the market and, again, no one is even trying. The
| easiest platform to do evolutionary DB development is MS SQL
| (a sketch of the style follows below).
|
| I really wish someone would implement DB Projects (dacpac)
| for PostgreSQL.
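|
| For anyone unfamiliar: in a database project the repo holds
| the desired end state as plain DDL, and the deploy tool
| (SqlPackage) computes the ALTERs. A table file is just
| (illustrative):
|
|     -- Tables/Customer.sql; edit the file, and the diff is
|     -- generated at publish time
|     CREATE TABLE dbo.Customer (
|         Id int NOT NULL PRIMARY KEY,
|         Email nvarchar(256) NOT NULL
|     );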
| manishsharan wrote:
| MS Azure SQL is very cheap compared to any other self-
| hosted database, including PG and MS SQL. Unless they're
| running a super heavy workload, this solution will meet
| most business requirements.
|
| But this is also a gateway drug for cloud addiction.
| kjax wrote:
| After scanning the Azure and AWS pricing pages and
| calculators for comparably specced Azure Managed SQL and
| Amazon RDS for PostgreSQL instances, AWS's RDS comes out
| at about 1/3 the price of Azure Managed SQL.
|
| This matches my experience with deployments on AWS and
| Azure. Both are managed db instances, but Azure's is
| consistently more expensive for comparable offerings on
| AWS or GCP.
| AlfeG wrote:
| Is there any analog for Azure SQL Elastic Pools for
| Postgres anywhere?
|
| We pay around 600 bucks in total to run around
| 250 databases on MSSQL servers (with failover for
| prod databases, DTU-based model).
|
| We pay more for log analytics than we pay for SQL
| Servers.
|
| Those Elastic Pools are a blocker for us on the way to
| migrating from MSSQL to Postgres...
| ilkhan4 wrote:
| It's been a while since I've looked at elastic pools in
| Azure, but maybe Neon (https://neon.tech/) or recently
| Nile (https://www.thenile.dev/) might work in terms of
| spinning up a bunch of separate logical DBs with shared
| compute/storage.
| nikita wrote:
| (neon ceo). We have lots of examples of this. Here is one
| with Retool. https://neon.tech/blog/how-retool-uses-
| retool-and-the-neon-a...
| clarkbw wrote:
| (neon employee) and Neon will be available on Azure very
| soon ( https://neon.tech/blog/neon-is-coming-to-azure )
| manishsharan wrote:
| I do not have experience with AWS RDS so I cannot speak
| to that.
|
| In my experience, GCP Cloud SQL for Postgres has been
| more expensive than MS Azure SQL. In our tests, CloudSQL
| also was not comparable to the resiliency offered by
| Azure SQL. Things like automated DR and automated
| failover were not on par with what Azure offered. Not
| to mention, column-level encryption is standard for
| Azure SQL.
| kragen wrote:
| does 'bi' mean 'olap'?
|
| because the literal expansion 'business intelligence' (or
| the explanation in the Wikipedia article) is hard to
| interpret as something that makes sense in contexts like
| this where you're apparently talking about features of
| software rather than social practices. the reference to 'bi
| cubes' makes me think that maybe it means 'olap'?
| gigatexal wrote:
| I think so but it's overloaded and basically used for
| reporting and analysis stuff.
| systems wrote:
| DSS became BI became AI.
|
| At some point in time, Decision Support Systems became
| Business Intelligence, and nowadays this is switching to
| AI.
|
| BI (formerly DSS) is a set of tools that enable business
| analytics; OLAP, and the technologies that implement it
| (cubes and columnar databases), enable BI.
| setr wrote:
| bi tooling is usually report/analysis builders, intended
| to be consumed by.. business users. More importantly,
| they're usually also meant to be used by business users
| to build their own reports in the first place -- although
| in practice it often ends up going back to IT to
| maintain/update anyways. Basically anything that competes
| with excel pivot tables is a bi tool.
|
| OLAP/cubes usually underpin bi tooling, since the use
| case is almost entirely across-the-board aggregations.
| marcosdumay wrote:
| It basically means "report editors".
|
| And no, MS ones aren't miles ahead of the competition.
| But they are bundled with Office, so the competition is
| mostly going out of business by now.
| greggyb wrote:
| Except Microsoft's report editor, Power BI, includes a
| powerful columnstore database that can handle importing
| and rapidly aggregating 100Ms or even 1Bs of records with
| query response times of just a couple seconds.
|
| That has been the differentiator. Power BI has lots going
| for and against it, but its database is miles ahead of
| competitors' report tools.
|
| Edit: clarification: importing large volumes of data
| takes more than a few seconds. The query performance of
| the in-memory database after import is complete is what I
| referred to above in my original comment.
| gigatexal wrote:
| What do you love so much about PowerBI? I've not looked at it
| very closely. I've worked with Tableau and Looker and
| LookerPro. All of which seemed fine.
|
| Is it a home run if the end users aren't on Windows or
| using Excel? I'm thinking about the feature where you can
| use a SQLServer DB as a data source to Excel.
| gigatexal wrote:
| I got my start in a SQLserver all MS shop.
|
| I hated SSIS. I wished I just had something akin to
| Python and dataframes to just parse things in a data
| pipeline. Instead I had some graphical tool whose error
| messages and deployments left a lot to be desired.
|
| But SQLServer and Microsoft in general make for good
| platforms for companies: they're ubiquitous enough that
| you won't find it hard to find engineers to work on your
| stack, there's support, etc
| wredue wrote:
| SSIS is for integrations, and pandas is definitely not.
| I'm not sure what you're trying to do with SSIS that
| you're also doing with pandas, but it's probably wrong.
| SSIS is far more geared to data warehousing integrations,
| while pandas would be reading a data warehouse and doing
| stuff with it. SSIS isn't really meant for processing
| incoming data, even if you can kind of hack it together
| to do that.
|
| I will say that when we want "real time" integrations,
| SSIS is phenomenally bad. But that's not entirely
| unexpected for what it is.
| gigatexal wrote:
| We don't need to be so pedantic. Python -- as it often is
| -- would be the glue, it would be the connecting part,
| and pandas (polars, duckdb, anything really) would be the
| processing part. Then once processed the outputs would be
| placed somewhere be it an update to a db table or some
| other downstream thing.
| systems wrote:
| The key feature of SSIS is parallel dataflow.
|
| You can very easily write (and schedule) parallel dataflows
| in SSIS; doing the same in a general purpose programming
| language would be a lot harder.
|
| Also remember that dataflows are data pipe streams, so
| SSIS can be very, very fast.
|
| Anyway, there is BIML, which allows you to create SSIS
| packages by writing XML. I personally never used it,
| mainly because its licensing situation seemed weird to me
| (I think BIML is free, but the tool support is not, and
| MS SSDT doesn't support BIML coding, I think).
| gigatexal wrote:
| Yeah I think I never gave it a fair shake. I think --
| like most things -- if understood and used properly it
| can be amazing.
| bradford wrote:
| > What do you love so about PowerBI?
|
| For a large portion of my career, the dashboarding
| solutions I've worked with have followed a similar
| model: they provide a presentation layer directly on top
| of a query of some kind (usually, but not always, a SQL
| query). This seems like a natural next step for
| organizations that have a lot of data in one spot, but no
| obvious way of visualizing it.
|
| But, after implementing and maintaining
| dashboards/reports constructed in this way, big problems
| start to arise. The core of the problem is that each
| dashboard/report/visual is tightly coupled to the
| datasource that's backing it. This tight coupling leads
| to many problems which I won't enumerate.
|
| Power BI is great because it can provide an abstraction
| layer (a semantic model) on top of the many various data
| sources that might be pushed into a report. You're free
| to combine data from Excel with MSSQL or random JSON, and
| it all plays together nicely in the same report. You can
| do data cleaning in the import stage, and the
| dimension/fact-tables pattern has been able to solve the
| wide variety of challenges that I've thrown at it.
|
| All this means that the PowerBI reports I've made have
| been far more adaptable to changes than the other tightly
| coupled solutions I've used. (I haven't used Tableau, but
| my understanding is that it provides similar modeling
| concepts to decouple the data input from the data origin.
| I'm not at all familiar with Looker).
|
| [disclaimer, I'm a Microsoft Employee (but I don't work
| in Power BI)]
| greggyb wrote:
| Power BI embeds a quite good columnstore database engine.
| It is exactly the same database engine as in the SQL
| Server Analysis Services Tabular model.
|
| For the types of queries typical in BI (primarily
| aggregations that involve lots of table scans), this
| yields great performance. Non-technically savvy users can
| import millions of rows of data and explore this with
| interactive visuals. The performance characteristics of
| the database engine allow moderately complex logic to
| remain high-performance into 10s or 100s of millions of
| records. This is on laptop-spec hardware. 100Ms of
| records will need enough supporting RAM, but the data is
| highly compressed, so this fits.
|
| The performance scaling allows simple aggregates of
| single-digit-billions-of-records to remain fast: usually
| just a couple of seconds. Performance tuning for more
| complex logic or larger datasets becomes complex.
|
| When I talk about the performance characteristics above,
| I mean for the in-memory database engine. This
| performance comes out of the box. There are no indices to
| create or typical database administration. Users set up
| an import of data and define relationships among tables,
| and define aggregation logic. The query language is DAX,
| a functional, relational language built with the goal of
| making business logic and aggregations simple to write.
|
| For simple use cases, users do not even need to write
| their logic in DAX. The viz layer automatically generates
| basic aggregates and provides a sophisticated filtering
| and interactive viz canvas experience.
|
| There is also an ETL layer, called Power Query. This has
| very different performance characteristics, and in
| general you should try to minimize your work in this tool
| and its M language, as it has very different semantics.
| The sweet spot is intra-row logic and basic filtering of
| data that will be imported into the columnstore database
| I mentioned above.
|
| The same underpinning technology, VertiPaq, supports
| columnar compression and storage/low level query in other
| Microsoft products.
|
| Columnstore indices in SQL Server also use VertiPaq for
| their compression and storage engine.
|
| The data storage layer in Microsoft Fabric utilizes
| parquet files in delta tables (same tech as Databricks).
| Part of the VertiPaq engine is a set of compression
| optimization heuristics that are quite good. They apply
| this optimization to get 30%-50% compression improvements
| in the parquet files they generate. They call this
| feature V-order (v for VertiPaq).
|
| The standard compression algorithms in Parquet include
| dictionary and run-length encoding. The same are used
| (though obviously with different implementations) in
| VertiPaq's native storage format. The compression
| optimization heuristics include some for identifying
| superior sort orders to help maximize the effect of these
| styles of compression. V-order is the application of
| these heuristics to the data before applying compression
| for the parquet files.
|
| To be 100% clear, V-order is a pre-processing step that
| yields vanilla parquet files, not any sort of extension
| to the parquet format. V-order is applied automatically
| in the storage layer for Microsoft Fabric, called
| OneLake.
|
| You may come across documentation for xVelocity; this is
| an abandoned name for the same technology, VertiPaq.
|
| I can talk about this literally all day. If you've got
| specific followups, feel free to ask here or email me at
| (any name) at (my username) dot com.
| greggyb wrote:
| Follow up regarding the platform question:
|
| Power BI is a desktop application for Windows and also a
| SaaS solution for hosting the solutions you build in the
| desktop tool.
|
| The SaaS product offering is confusing for the time being
| due to the introduction of Fabric, which includes Power
| BI and a whole data stack besides.
|
| Your Power BI power users will need a Windows platform
| available, though I have seen plenty of use of Parallels
| or a remote desktop infrastructure for users not using a
| Windows device.
|
| There are authoring experiences in-browser that are
| improving and in some cases surpassing what is in Power
| BI Desktop, but if you're doing a lot of work authoring
| in Power BI, then the desktop application is where you
| want to be for now.
|
| For pure querying or _just_ data model development, there
| are third party tools that are superior to Power BI
| Desktop (disclosure: I do work part time with one of
| these third party vendors).
| sroussey wrote:
| At least both of these databases support clustered indexes.
| For decades.
| pphysch wrote:
| > And finally the GEM that is database projects, you can
| program your DB changes declaratively, there is nothing
| like this in the market and again, no one is even trying
|
| Automatic migration tools have essentially been doing this
| for a while (e.g. Django migrations). But I agree it would
| be nice if Postgres had better support built in.
| koolba wrote:
| Database migrations to arrive at a target state are an
| interesting beast. It's not just the destination that
| matters, it's how you get there.
|
| The naive approach of applying whatever DDL you need to
| turn database schema A into B is not necessarily the one
| anybody would actually want to run, and in larger
| deployments it's most definitely not the one you'd want
| to run.
|
| Intermediate availability and data migration go well
| beyond simply adding tables or columns. They're highly
| application specific.
|
| The main improvements in this space are making existing
| operations CONCURRENT (e.g., index creation) and
| minimizing overall lock times for operations that can't
| be (e.g., adding columns with default values).
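|
| e.g. the difference in spirit (index name illustrative):
|
|     -- naive diff output: blocks writes while it builds
|     CREATE INDEX orders_created_idx ON orders (created_at);
|
|     -- what you actually want on a live table
|     CREATE INDEX CONCURRENTLY orders_created_idx
|         ON orders (created_at);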
| pphysch wrote:
| I agree automating migrations correctly from state A to B
| without _any_ risk of data loss or downtime is
| essentially an intractable problem.
|
| Still, the popular tools seem to have good heuristics and
| good DBAs recognize when migrations must be written by
| hand.
| Guillaume86 wrote:
| I fully agree on the sql project type, it's amazing.
| Another great side benefit is intellisense on your db
| schema (find all references, etc); git blame on the schema
| is also great. Doesn't Redgate offer kind of the same tool
| commercially, with support for more than mssql? Never
| tried it but IIRC it looked similar...
| 0cf8612b2e1e wrote:
| A boolean column type.
| marcosdumay wrote:
| Microsoft is way less likely to sue you and a couple of
| orders of magnitude cheaper than Oracle.
|
| Besides, managing Microsoft licensing is bliss compared to
| Oracle's. And yeah, MSSQL is much better than Oracle in
| almost every way.
|
| If you only compare those two, it's a no-brainer.
| jmull wrote:
| Well, you can't _necessarily_ trust open source for business
| software.
|
| The more deeply your business depends on something, the more
| careful you need to be when selecting the source for that
| something. (And the db is often very deeply depended on.)
|
| You want to see that their long-term incentives align with
| your own needs.
|
| But a revenue stream is just one way to do this, and not a
| perfect one. (Case in point: Oracle.)
|
| In my experience, SQL Server isn't bad though. I know a couple
| commercial products that started with SQL Server in the late
| '90s and remain happy with it now. The cost hasn't been a
| problem and they like the support and evolution of the product.
| They find they can adopt newer versions and features when they
| need to without too much pain/cost/time.
|
| (Not that I don't think Postgres is generally a better green-
| field pick, though, and even more so porting away from Oracle.)
| bityard wrote:
| I ran into a lot of that 20 years ago, surprised to hear it's
| still a thing at all given how it's basically common knowledge
| that most of the Internet and Cloud run on open source
| software.
|
| I once met an older gentleman who was doing IT work for a
| defense contractor. He seemed nice enough. We were making small
| talk and I happened to mention that I had recently installed
| Linux on my computer at home. He tone changed almost
| immediately and he started ranting about how Linux was pirated
| source code, stolen from Microsoft, all of it contains viruses,
| etc. He was talking about the SCO vs Linux lawsuits but of
| course got absolutely ALL of the details wrong, like which
| companies were even involved in the lawsuits. He was so far off
| the deep end that I didn't even try to correct him, I just
| nodded and smiled and said I was actually running late to be
| somewhere else...
| throw351203910 wrote:
| What your boss doesn't realize is your business already depends
| on FOSS. Here are a few examples:
|
| - Do you use any cloud provider? Those platforms are built on
| top of open source software: Linux, nginx (e.g. Cloudflare's
| edge servers before the rust rewrite), ha-proxy (AWS ELB),
| etc.
|
| - The software your business builds or depends on probably
| uses open source libraries (e.g. libc).
|
| - The programming languages your business uses directly or
| indirectly are probably open source.
|
| My point is that folks who make these kinds of statements
| have no clue how their software is built or what kind of
| software their business actually depends on.
| baq wrote:
| mssql is a _great_ rdbms. t-sql is... different... in certain
| places but all in all if cost isn't a big issue you really
| can't go wrong by picking it.
| pestaa wrote:
| Very impressive changelog.
|
| Bit sad the UUIDv7 PR didn't make the cut just yet:
|
| https://commitfest.postgresql.org/49/4388/
| ellisv wrote:
| I've been waiting for "incremental view maintenance" (i.e.
| incremental updates for materialized views) but it looks like
| it's still a few years out.
| whitepoplar wrote:
| There's always the pg_ivm extension you can use in the
| meantime: https://github.com/sraoss/pg_ivm
| ellisv wrote:
| Unfortunately we use a cloud provider to host our
| databases, so I can only install limited extensions.
| gregwebs wrote:
| REFRESH CONCURRENTLY is already an incremental update of
| sorts, although you still pay the price of running the full
| query.
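|
| Sketch (the matview needs a unique index for the
| concurrent path; names illustrative):
|
|     CREATE UNIQUE INDEX ON my_matview (id);
|     REFRESH MATERIALIZED VIEW CONCURRENTLY my_matview;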
| isoprophlex wrote:
| Wow, brilliant! I never knew this existed. Going to try
| this out tomorrow, first thing!
| JamesSwift wrote:
| I'm a huge fan of views as the v1 solution for problems
| before we need to optimize our approach, and this is the
| main blocker in those discussions. If only v2 of the
| approach could be an IVM view, we could leverage views
| much more widely.
| 0cf8612b2e1e wrote:
| I must be missing something, because that feels easy to
| implement: a timestamp plus random data, in the same way
| as UUIDv4.
|
| Where is the iceberg of complexity?
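|
| Something like the user-space snippet that's been floating
| around (48-bit ms timestamp overlaid on random bytes, then
| the version bits set; a sketch, not battle-tested):
|
|     CREATE FUNCTION uuid_generate_v7() RETURNS uuid AS $$
|     SELECT encode(
|         set_bit(
|           set_bit(
|             overlay(uuid_send(gen_random_uuid())
|               placing substring(int8send((extract(epoch FROM
|                 clock_timestamp()) * 1000)::bigint) FROM 3)
|               FROM 1 FOR 6),
|           52, 1),
|         53, 1),
|       'hex')::uuid
|     $$ LANGUAGE sql VOLATILE;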
| lfittl wrote:
| In my understanding it was a timing issue with the UUIDv7 RFC
| not being finalized before the Postgres 17 feature freeze in
| early April. Shouldn't be an issue to get this in for
| Postgres 18, I think.
| 0cf8612b2e1e wrote:
| That would do it. I was mistakenly thinking v7 was done for
| months/years at this point.
| miohtama wrote:
| The titan keeps rocking.
| yen223 wrote:
| MERGE support for updating views is huge. So looking forward to
| this
| majkinetor wrote:
| Now it also supports the RETURNING clause!
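|
| e.g. (17 also adds merge_action() to see what happened to
| each row; table names illustrative):
|
|     MERGE INTO items i
|     USING incoming s ON i.sku = s.sku
|     WHEN MATCHED THEN UPDATE SET qty = s.qty
|     WHEN NOT MATCHED THEN INSERT (sku, qty)
|         VALUES (s.sku, s.qty)
|     RETURNING merge_action(), i.*;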
| jackschultz wrote:
| Very cool with the JSON_TABLE. The style of putting a json
| response (from an API, created from scraping, etc.) into a
| jsonb column and then writing a view on top to parse /
| flatten is something I've been doing for a few years now.
| I've found it really great to put the json into a table,
| somewhere safe, and then do the parsing there rather than
| dealing with possible errors on the scripting language side.
| I haven't seen this style used in other places before, and
| seeing it in the docs as a feature of new postgres makes me
| feel a bit more sane. Will be cool to try this out and see
| the differences from what I was doing before!
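|
| With 17 the view layer of that pattern can be a single
| JSON_TABLE (the schema here is a made-up example):
|
|     CREATE TABLE raw_responses (id bigserial, body jsonb);
|
|     CREATE VIEW responses_flat AS
|     SELECT r.id, jt.*
|     FROM raw_responses r,
|         JSON_TABLE(r.body, '$.items[*]' COLUMNS (
|             sku text PATH '$.sku',
|             qty int PATH '$.qty')) AS jt;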
| ellisv wrote:
| It is definitely an improvement on multiple
| `JSONB_TO_RECORDSET` and `JSONB_TO_RECORD` calls for flattening
| nested json.
| abyesilyurt wrote:
| > putting a json response (from an API, created from
| scraping, etc.) into a jsonb column and then writing a view
| on top to parse
|
| That's a very good idea!
| 0cf8612b2e1e wrote:
| A personal rule of mine is to always separate data
| receipt+storage from parsing. The retrieval is comparatively
| very expensive and has few possible failure modes. Parsing can
| always fail in new and exciting ways.
|
| Disk space to store the returned data is cheap and can be
| periodically flushed only when you are certain the content has
| been properly extracted.
| erichocean wrote:
| I ended up with the same design after encountering numerous
| exotic failure modes.
| ksec wrote:
| With 17, is Vacuum largely a solved issue?
| ses1984 wrote:
| I'm not up to date on the recent changes, but problems we had
| with vacuum were more computation and iops related than memory
| related.
|
| Basically in a database with a lot of creation/deletion,
| database activity can outrun the vacuum, leading to out of
| storage errors, lock contention, etc
|
| In order to keep throughput up, we had to throttle things
| manually on the input side, to allow vacuum to complete.
| Otherwise throughput would eventually drop to zero.
| sgarland wrote:
| You can also tune various [auto]vacuum settings that can
| dramatically increase the amount of work done in each cycle.
|
| I'm not discounting your experience as anything is possible,
| but I've never had to throttle writes, even on large clusters
| with hundreds of thousands of QPS.
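|
| For reference, the usual knobs look like this (values are
| illustrative starting points, not recommendations):
|
|     ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;
|     ALTER SYSTEM SET autovacuum_naptime = '15s';
|     ALTER TABLE hot_table
|         SET (autovacuum_vacuum_scale_factor = 0.01);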
| jabl wrote:
| There was a big project to re-architect the low level storage
| system to something that isn't dependent on vacuuming, called
| zheap. Unfortunately it seems to have stalled and nobody seems
| to be working on it anymore? I keep scanning the release notes
| for each new pgsql version, but no dice.
| qazxcvbnm wrote:
| I think OrioleDB is picking up where zheap left off, and it
| seems quite active in upstreaming its work; you might be
| interested to check it out.
| imbradn wrote:
| No, vacuum issues are not solved. This will reduce the amount
| of scanning needed in many cases when vacuuming indexes. It
| will mean more efficient, quicker vacuums, which will help in
| a lot of cases.
| ktosobcy wrote:
| Would be awesome if PostgreSQL would finally add support for
| seamless major version upgrade...
| ellisv wrote:
| I'm curious what you feel is specifically missing.
| levkk wrote:
| pg_upgrade is a bit manual at the moment. If the database
| could just be pointed at a data directory and upgrade it
| automatically on startup, that would be great.
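|
| Today it's roughly this, by hand (paths illustrative):
|
|     pg_upgrade \
|       --old-bindir /usr/lib/postgresql/16/bin \
|       --new-bindir /usr/lib/postgresql/17/bin \
|       --old-datadir /var/lib/postgresql/16/main \
|       --new-datadir /var/lib/postgresql/17/main \
|       --link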
| olavgg wrote:
| I agree, why is this still needed? It can run pg_upgrade in
| the background.
| vbezhenar wrote:
| It needs binaries for both old and new versions for some
| reason.
| dewey wrote:
| Maybe https://github.com/pgautoupgrade/docker-pgautoupgrade is
| interesting for you.
| ticoombs wrote:
| I use this for my Lemmy instance & lemmy-ansible and it's
| been great! No longer having to support upgrade scripts and
| write a complete upgrade process[1] for people to follow has
| made my life a lot easier! Amazing product
|
| - [1] https://github.com/LemmyNet/lemmy-
| ansible/blob/main/UPGRADIN...
| h1fra wrote:
| Amazing, it's been a long time since I have been this
| excited by a software release!
| monkaiju wrote:
| PG just continues to impress!
| clarkbw wrote:
| Some awesome quality-of-life improvements here as well. The
| random function now takes min and max parameters:
|
| SELECT random(1, 10) AS random_number;
| clarkbw wrote:
| It's never available on homebrew the same day so we all worked
| hard to make it available the same day on Neon. If you want to
| try out the JSON_TABLE and MERGE RETURNING features you can
| spin up a free instance quickly.
|
| https://neon.tech/blog/postgres-17
|
| (note that not all extensions are available yet, that takes
| some time still)
| dewey wrote:
| How is your hosted managed Postgres platform an alternative
| to Homebrew (Installing PG on your local Mac)? There's also
| many Homebrew formulas that have PG 17 already like
| (https://github.com/petere/homebrew-
| postgresql/commit/2faf438...).
| clarkbw wrote:
| oh, i see I'm getting downvoted; this isn't a sales pitch
| for my "managed postgres platform".
|
| every year a new postgres release occurs, and when i want
| to try out some of the features i have to find a way to
| get it. usually nobody has it available.
|
| here's the list of homebrew options I see right now:
|
| brew formulae | grep postgresql@
| postgresql@10
| postgresql@11
| postgresql@12
| postgresql@13
| postgresql@14
| postgresql@15
| postgresql@16
|
| maybe you're seeing otherwise but i updated a min ago and
| 17 isn't there yet. even searching for 'postgres' on
| homebrew doesn't reveal any options for 17. i don't know
| where you've found those but it doesn't seem easily
| available.
|
| and i'm not suggesting you use a cloud service as an
| alternative to homebrew or local development. neon is pure
| postgres, the local service is the same as what's in the
| cloud. but right now there isn't an easy local version and
| i wanted everyone else to be able to try it quickly.
| paws wrote:
| I'm looking forward to trying out incremental backups as well as
| JSON_TABLE.
|
| Thank you contributors!
| andreashansen wrote:
| Oh how I wish for Postgres to introduce system-versioned (bi-
| temporal) tables.
| qianli_cs wrote:
| What's your use case for system-versioned tables? You could use
| some extensions like Periods that support bi-temporal tables:
| https://wiki.postgresql.org/wiki/Temporal_Extensions
|
| Or you could use triggers to build one:
| https://hypirion.com/musings/implementing-system-versioned-t...
| baq wrote:
| any system of record has a requirement to be bitemporal; it
| just isn't discovered until too late, IME. I don't know of
| any system anywhere that consciously decided not to be
| bitemporal during initial design.
| jeltz wrote:
| It will hopefully be in PostgreSQL 18.
| lpapez wrote:
| Amazing release, Postgres is a gift that keeps on giving.
|
| I hope to one day see Incremental View Maintenance extension
| (IVM) be turned into a first class feature, it's the only thing I
| need regularly which isn't immediately there!
| nikita wrote:
| A number of features stood out to me in this release:
|
| 1. Chipping away more at vacuum. Fundamentally Postgres
| doesn't have an undo log and therefore has to vacuum. It's a
| trade-off of fast recovery vs, well... having to vacuum. The
| unfortunate part about vacuum is that it adds load to the
| system exactly when the system needs all the resources. I
| hope one day people will stop needing to know that vacuum
| exists; we are one step closer, but not there yet.
|
| 2. Performance gets better and not worse. Mark Callaghan
| blogs about MySQL and Postgres performance changes over
| time; MySQL keeps regressing performance while Postgres
| keeps improving.
|
| https://x.com/MarkCallaghanDB https://smalldatum.blogspot.com/
|
| 3. JSON. Postgres keeps improving QOL for interop with JS
| and TS.
|
| 4. Logical replication is becoming a super robust way of
| moving data in and out. This is very useful when you move
| data from one instance to another, especially if version
| numbers don't match. Recently we have been using it to move
| data at 1 Gb/s (a minimal sketch follows at the end of this
| comment).
|
| 5. Optimizer. The better the optimizer, the less you think
| about the optimizer. According to the research community,
| SQL Server has the best optimizer. It's very encouraging
| that the PG optimizer gets better with every release.
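|
| The sketch for (4), with illustrative names:
|
|     -- on the source
|     CREATE PUBLICATION migrate_pub FOR ALL TABLES;
|
|     -- on the target (can be a different major version)
|     CREATE SUBSCRIPTION migrate_sub
|         CONNECTION 'host=old-primary dbname=app user=repl'
|         PUBLICATION migrate_pub;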
| netcraft wrote:
| I remember when the json stuff started coming out, I thought it
| was interesting but nothing I would ever want to rely on - boy
| was I wrong!
|
| It is so nice having json functionality in a relational db - even
| if you never actually store json in your database, it's useful in
| so many situations.
|
| Being able to generate json in a query from your data is a big
| deal too.
|
| Looking forward to really learning json_table
| openrisk wrote:
| There was LAMP and then MERN and MEAN etc. and then there was
| Postgres.
|
| It's not quite visible yet, but all this progres by postgres
| (excuse the pun) on making JSON more deeply integrated with
| relational principles will surely at some point enable a new
| paradigm, at least for full-stack web frameworks?
| winrid wrote:
| The problem with postgres's JSON support is that it leads to
| last-write-wins race conditions, compared to actual document
| stores. Once it can support operations like incrementing a
| number while someone else updates another field, without
| locking the row, then maybe.
|
| I don't think you'll see a paradigm shift. You'll just see
| people using other document stores less.
| sgarland wrote:
| We're already in a new paradigm, one in which web devs are
| abandoning referential integrity guarantees in favor of not
| having to think about data modeling.
|
| I'd say they're then surprised when their DB calls are slow
| (nothing to do with referential integrity, just the
| TOAST/DETOAST overhead), but since they also haven't
| experienced how fast a DB with local disks and good data
| locality can be, they have no idea.
| switch007 wrote:
| Props to all the people (and companies) behind Postgres
| https://www.postgresql.org/community/contributors/
| Rican7 wrote:
| Wow, yea, the performance gains and new UX features (JSON_TABLE,
| MERGE improvements, etc) are huge here, but these really stand
| out to me:
|
| > PostgreSQL 17 supports using identity columns and exclusion
| constraints on partitioned tables.
|
| > PostgreSQL 17 also includes a built-in, platform independent,
| immutable collation provider that's guaranteed to be immutable
| and provides similar sorting semantics to the C collation except
| with UTF-8 encoding rather than SQL_ASCII. Using this new
| collation provider guarantees that your text-based queries will
| return the same sorted results regardless of where you run
| PostgreSQL.
| __natty__ wrote:
| This release is good on so many levels. Just the performance
| optimisations and the JSON_TABLE feature could have been a
| whole release on their own, but we got so much more.
| nojvek wrote:
| I wish postgres supported parquet file imports and exports.
| The COPY command with CSV is really slooooooooow. Even BINARY
| is quite slow and bandwidth-heavy.
|
| I wonder how open postgres is and what kinds of pull requests
| the postgres team considers. I'd like to learn how to
| contribute to PG in baby steps and eventually get to a place
| where I could contribute substantial features.
| ioltas wrote:
| There has been a patch to extend the COPY code with pluggable
| APIs, adding callbacks at start, end, and for each row
| processed: https://commitfest.postgresql.org/49/4681/.
|
| I'd guess that this may fit your purpose of adding a custom
| format without having to fork upstream.
___________________________________________________________________
(page generated 2024-09-26 23:01 UTC)