[HN Gopher] PostgreSQL 17
___________________________________________________________________
PostgreSQL 17
Author : jkatz05
Score : 519 points
Date : 2024-09-26 13:10 UTC (9 hours ago)
(HTM) web link (www.postgresql.org)
(TXT) w3m dump (www.postgresql.org)
| kiwicopple wrote:
| Another amazing release, congrats to all the contributors. There
| are simply too many things to call out - just a few highlights:
|
| Massive improvements to vacuum operations:
| "PostgreSQL 17 introduces a new internal memory structure for
| vacuum that consumes up to 20x less memory."
|
| Much needed features for backups:
| "pg_basebackup, the backup utility included in PostgreSQL, now
| supports incremental backups and adds the pg_combinebackup
| utility to reconstruct a full backup"
|
| I'm a huge fan of FDWs and think they are an untapped gem in
| Postgres, so I love seeing these improvements:
| "The PostgreSQL foreign data wrapper (postgres_fdw), used to
| execute queries on remote PostgreSQL instances, can now push
| EXISTS and IN subqueries to the remote server for more efficient
| processing."
| ellisv wrote:
| > I'm a huge fan of FDWs
|
| Do you have any recommendations on how to manage credentials
| for `CREATE USER MAPPING` within the context of cloud-hosted
| DBs?
| darth_avocado wrote:
| If your company doesn't have an internal tool for storing
| credentials, you can always store them in the cloud
| provider's secrets management tool. E.g. Secrets Manager or
| Secure String in Parameter Store on AWS. Your CI/CD pipeline
| can pull the secrets from there.
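|
| For example, the CI step can inject the secret as a psql
| variable; a minimal sketch (the server, role, and
| fetch-secret command here are all illustrative):
|
|     -- psql -v fdw_password="$(fetch-secret ...)" -f map.sql
|     CREATE USER MAPPING IF NOT EXISTS FOR app_user
|         SERVER remote_pg
|         OPTIONS (user 'fdw_user', password :'fdw_password');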
| kiwicopple wrote:
| in supabase we have a "vault" utility for this (for example:
| https://fdw.dev/catalog/clickhouse/#connecting-to-
| clickhouse). Sorry I can't make recommendations for other
| platforms because i don't want to suggest anything that could
| be considered unsafe - hopefully others can chime in
| brunoqc wrote:
| I batch import XMLs, CSVs and mssql data into postgresql.
|
| I'm pretty sure I could read them when needed with fdw. Is it a
| good idea?
|
| I think it can be slow but maybe I could use materialized views
| or something.
| kiwicopple wrote:
| "it depends". Some considerations for mssql:
|
| - If the foreign server is close (latency) that's great
|
| - if your query is complex then it helps if the postgres
| planner can "push down" to mssql. That will usually happen if
| you aren't doing joins to local data
|
| I personally like to set up the foreign tables, then
| materialize the data into a local postgres table using
| pg_cron. It's like a basic ETL pipeline completely built into
| postgres
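|
| A rough sketch of that setup (tds_fdw for mssql; all names,
| the column list, and the schedule are illustrative):
|
|     CREATE FOREIGN TABLE remote_orders (
|         id int,
|         total numeric,
|         updated_at timestamptz
|     ) SERVER mssql_srv OPTIONS (table_name 'dbo.orders');
|
|     -- upsert into a local copy every 15 minutes
|     SELECT cron.schedule('sync-orders', '*/15 * * * *',
|       $$INSERT INTO local_orders
|         SELECT * FROM remote_orders
|         ON CONFLICT (id) DO UPDATE SET
|           total = EXCLUDED.total,
|           updated_at = EXCLUDED.updated_at$$);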
| mind-blight wrote:
| I've been using duckdb to import data into postgres
| (especially CSVs and JSON) and it has been really effective.
|
| Duckdb can run SQL across the different data formats and
| insert or update directly into postgres. I run duckdb with
| python and Prefect for batch jobs, but you can use whatever
| language or scheduler you prefer.
|
| I can't recommend this setup enough. The only weird thing
| I've run into: a really complex join across multiple
| postgres tables and parquet files hit a bug reading a
| postgres column type. I simplified the query (which was a
| good idea anyway) and it hums away.
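|
| The core of it is tiny. A sketch using DuckDB's postgres
| extension (connection string and table names made up):
|
|     INSTALL postgres; LOAD postgres;
|     ATTACH 'dbname=app host=localhost' AS pg (TYPE postgres);
|     INSERT INTO pg.public.staging_events
|     SELECT * FROM read_csv_auto('events.csv');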
| brunoqc wrote:
| Thanks. My current pattern is to parse the files with rust,
| copy from stdin into a psql temp table, update the rows
| that have changed and delete the rows not existing anymore.
|
| I'm hoping it's less wasteful than truncating and importing
| the whole table every time there is one single change.
| veggieroll wrote:
| Loving the continued push for JSON features. I'm going to get a
| lot of use out of JSON_TABLE. And json_scalar & json_serialize
| are going to be helpful at times too. JSON_QUERY with OMIT QUOTES
| is awesome too for some things.
|
| I hope SQLite3 can implement SQL/JSON soon too. I have a
| library of compatibility functions to generate the
| appropriate JSON operations depending on whether it's SQLite3
| or PostgreSQL. And it'd be nice to reduce the number of
| incompatibilities over time.
|
| But, there's a ton of stuff in the release notes that jumped out
| at me too:
|
| "COPY .. ON_ERROR" ignore is going to be nice for loading data
| anywhere that you don't care if you get all of it. Like a dev
| environment or for just exploring something. [1]
|
| Improvements to CTE plans are always welcome. [2]
|
| "transaction_timeout" is an amazing addition to the existing
| "statement_timeout" as someone who has to keep an eye on less
| experienced people running SQL for analytics / intelligence. [3]
|
| There's a function to get the timestamp out of a UUID easily now,
| too: uuid_extract_timestamp(). This previously required a user
| defined function. So it's another streamlining thing that's nice.
| [4]
|
| I'll use the new "--exclude-extension" option for pg_dump, too. I
| just got bitten by that when moving a database. [5]
|
| "Allow unaccent character translation rules to contain whitespace
| and quotes". Wow. I needed this! [6]
|
| [1]
| https://www.postgresql.org/docs/17/release-17.html#RELEASE-1...
|
| [2]
| https://www.postgresql.org/docs/17/release-17.html#RELEASE-1...
|
| [3]
| https://www.postgresql.org/docs/17/release-17.html#RELEASE-1...
|
| [4]
| https://www.postgresql.org/docs/17/release-17.html#RELEASE-1...
|
| [5]
| https://www.postgresql.org/docs/17/release-17.html#RELEASE-1...
|
| [6]
| https://www.postgresql.org/docs/17/release-17.html#RELEASE-1...
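|
| A quick taste of a few of those (paths and values made up):
|
|     -- load the rows that parse, skip the bad ones
|     COPY staging FROM '/tmp/dump.csv'
|         (FORMAT csv, ON_ERROR ignore);
|
|     -- cap the whole transaction, not just one statement
|     SET transaction_timeout = '5min';
|
|     -- timestamp straight out of a v1/v7 UUID column
|     SELECT uuid_extract_timestamp(id) FROM events LIMIT 1;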
| nbbaier wrote:
| > I hope SQLite3 can implement SQL/JSON soon too. I have a
| library of compatibility functions to generate the
| appropriate JSON operations depending on whether it's
| SQLite3 or PostgreSQL. And it'd be nice to reduce the number
| of incompatibilities over time.
|
| Is this available anywhere? Super interested
| veggieroll wrote:
| No, it's not at the moment. Sorry!
|
| The most useful part is doing set intersection operations on
| JSON arrays. The second is probably extracting a value by
| path across both.
|
| It's not crazy to implement; SQLite was the harder side. Just
| a bit of fiddling with `json_each`, EXISTS, and aggregate
| functions.
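|
| e.g. the SQLite side of an array intersection is roughly:
|
|     SELECT value FROM json_each('[1,2,3]')
|     INTERSECT
|     SELECT value FROM json_each('[2,3,4]');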
| nbbaier wrote:
| Might give it a whirl. Is the library pure sql or is it
| written in something else?
| veggieroll wrote:
| I've been using Go with Goqu [1] for SQL for a lot of
| things lately. But, the language shouldn't matter much.
| The functions are just building the appropriate SQL
| clauses.
|
| For some of the operations, the method I was using
| required marshaling the inputs to JSON before sending
| them over the wire. And that's nicer in a non-SQL
| programming language. But both DBs ultimately do have
| json_build_array/json_build_object for PostgreSQL and
| json_array/json_object for SQLite3.
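|
| i.e. the same literal on each side:
|
|     SELECT json_build_array(1, 'a');  -- PostgreSQL
|     SELECT json_array(1, 'a');        -- SQLite3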
|
| [1] https://github.com/doug-martin/goqu
| pbronez wrote:
| Yeah JSON_TABLE looks pretty cool. Here are the details:
|
| https://www.postgresql.org/docs/17/functions-json.html#FUNCT...
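|
| A minimal example of the shape (data made up):
|
|     SELECT t.* FROM JSON_TABLE(
|         '[{"sku": "a1", "qty": 2}]'::jsonb,
|         '$[*]' COLUMNS (
|             sku text PATH '$.sku',
|             qty int PATH '$.qty')
|     ) AS t;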
| on_the_train wrote:
| My boss insisted on the switch from oracle to mssql. Because "you
| can't trust open source for business software". Oh the pain
| gigatexal wrote:
| So from one expensive vendor to another? Your boss seems smart.
| ;-)
|
| What's the rationale? What do you gain?
| on_the_train wrote:
| Exactly. Supposedly the paid solution ensures long-term
| support. The most fun part is that our customers need to buy
| these database licenses, so it directly reduces our own pay.
| Say no to non-technical (or irrational) managers :<
| remram wrote:
| Did you not pay for Oracle?
| on_the_train wrote:
| We pay, but what hurts is that our customers need to pay,
| too. For both oracle and ms of course
| systems wrote:
| Well, from one VERY expensive vendor to another considerably
| less expensive vendor.
|
| Also, MSSQL has a few things going for it, and surprisingly
| no one seems to be even trying to catch up:
|
| - Their BI stack (PowerBI, SSAS)
|
| - Their database development tooling, SQL Database Projects
| ( https://learn.microsoft.com/en-
| us/sql/tools/sql-database-projects/sql-database-
| projects?view=sql-server-ver16 )
|
| The MSSQL BI stack is unmatched: SSAS is the top star of BI
| cubes and the second option is not even close.
|
| SSRS is ok and SSIS is passable, but both are still very
| decent.
|
| PowerBI and family is also the best option for mid to large
| (not FAANG large, just normal large) companies.
|
| And finally, the GEM that is database projects: you can
| program your DB changes declaratively. There is nothing like
| this in the market and, again, no one is even trying. The
| easiest platform to do evolutionary DB development is MS SQL
| (a sketch of the style follows below).
|
| I really wish someone would implement DB Projects (dacpac)
| for PostgreSQL.
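|
| For anyone unfamiliar: in a database project the repo holds
| the desired end state as plain DDL, and the deploy tool
| (SqlPackage) computes the ALTERs. A table file is just
| (illustrative):
|
|     -- Tables/Customer.sql; edit the file, and the diff is
|     -- generated at publish time
|     CREATE TABLE dbo.Customer (
|         Id int NOT NULL PRIMARY KEY,
|         Email nvarchar(256) NOT NULL
|     );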
| manishsharan wrote:
| MS Azure SQL is very cheap compared to any other self-
| hosted database, including PG and MS SQL. Unless they're
| running a super heavy workload, this solution will meet
| most business requirements.
|
| But this is also a gateway drug for cloud addiction.
| kjax wrote:
| After scanning the Azure and AWS pricing pages and
| calculators for comparably specced Azure Managed SQL and
| Amazon RDS for PostgreSQL instances, AWS's RDS comes out
| at about 1/3 the price of Azure Managed SQL.
|
| This matches my experience with deployments on AWS and
| Azure. Both are managed db instances, but Azure's is
| consistently more expensive for comparable offerings on
| AWS or GCP.
| AlfeG wrote:
| Is there any analog for Azure SQL Elastic Pools for
| Postgres anywhere?
|
| We pay around 600 bucks in total to run around
| 250 databases on MSSQL servers (with failover for
| prod databases, DTU-based model).
|
| We pay more for log analytics than we pay for SQL
| Servers.
|
| Those Elastic Pools are a blocker for us on the way to
| migrating from MSSQL to Postgres...
| ilkhan4 wrote:
| It's been a while since I've looked at elastic pools in
| Azure, but maybe Neon (https://neon.tech/) or recently
| Nile (https://www.thenile.dev/) might work in terms of
| spinning up a bunch of separate logical DBs with shared
| compute/storage.
| nikita wrote:
| (neon ceo). We have lots of examples of this. Here is one
| with Retool. https://neon.tech/blog/how-retool-uses-
| retool-and-the-neon-a...
| clarkbw wrote:
| (neon employee) and Neon will be available on Azure very
| soon ( https://neon.tech/blog/neon-is-coming-to-azure )
| manishsharan wrote:
| I do not have experience with AWS RDS so I cannot speak
| to that.
|
| In my experience, GCP Cloud SQL for Postgres has been
| more expensive than MS Azure SQL. In our tests, CloudSQL
| also was not comparable to the resiliency offered by
| Azure SQL. Things like automated DR and automated
| failover were not on par with what Azure offered. Not
| to mention, column-level encryption is standard for
| Azure SQL.
| kragen wrote:
| does 'bi' mean 'olap'?
|
| because the literal expansion 'business intelligence' (or
| the explanation in the Wikipedia article) is hard to
| interpret as something that makes sense in contexts like
| this where you're apparently talking about features of
| software rather than social practices. the reference to 'bi
| cubes' makes me think that maybe it means 'olap'?
| gigatexal wrote:
| I think so but it's overloaded and basically used for
| reporting and analysis stuff.
| systems wrote:
| DSS became BI became AI.
|
| At some point in time, Decision Support Systems became
| Business Intelligence, and nowadays this is switching to
| AI.
|
| BI (formerly DSS) is a set of tools that enable business
| analytics; OLAP, and the technologies that implement it
| (cubes and columnar databases), enable BI.
| setr wrote:
| bi tooling is usually report/analysis builders, intended
| to be consumed by.. business users. More importantly,
| they're usually also meant to be used by business users
| to build their own reports in the first place -- although
| in practice it often ends up going back to IT to
| maintain/update anyways. Basically anything that competes
| with excel pivot tables is a bi tool.
|
| OLAP/cubes usually underpin bi tooling, since the use
| case is almost entirely across-the-board aggregations.
| marcosdumay wrote:
| It basically means "report editors".
|
| And no, MS ones aren't miles ahead of the competition.
| But they are bundled with Office, so the competition is
| mostly going out of business by now.
| greggyb wrote:
| Except Microsoft's report editor, Power BI, includes a
| powerful columnstore database that can handle importing
| and rapidly aggregating 100Ms or even 1Bs of records with
| query response times of just a couple seconds.
|
| That has been the differentiator. Power BI has lots going
| for and against it, but its database is miles ahead of
| competitors' report tools.
|
| Edit: clarification: importing large volumes of data
| takes more than a few seconds. The query performance of
| the in-memory database after import is complete is what I
| referred to above in my original comment.
| gigatexal wrote:
| What do you love so much about PowerBI? I've not looked at it
| very closely. I've worked with Tableau and Looker and
| LookerPro. All of which seemed fine.
|
| Is it a home run if the end users aren't on Windows or
| using Excel? I'm thinking about the feature where you can
| use a SQLServer DB as a data source to Excel.
| gigatexal wrote:
| I got my start in a SQLserver all MS shop.
|
| I hated SSIS. I wished I just had something akin to
| Python and dataframes to just parse things in a data
| pipeline. Instead I had some graphical tool whose error
| messages and deployments left a lot to be desired.
|
| But SQLServer and Microsoft in general make for good
| platforms for companies: they're ubiquitous enough that
| you won't find it hard to find engineers to work on your
| stack, there's support, etc
| wredue wrote:
| SSIS is for integrations, and pandas is definitely not.
| I'm not sure what you're trying to do with SSIS that
| you're also doing with pandas, but it's probably wrong.
| SSIS is far more geared to data warehousing integrations,
| while pandas would be reading a data warehouse and doing
| stuff with it. SSIS isn't really meant for processing
| incoming data, even if you can kind of hack it together
| to do that.
|
| I will say that when we want "real time" integrations,
| SSIS is phenomenally bad. But that's not entirely
| unexpected for what it is.
| gigatexal wrote:
| We don't need to be so pedantic. Python -- as it often is
| -- would be the glue, it would be the connecting part,
| and pandas (polars, duckdb, anything really) would be the
| processing part. Then once processed the outputs would be
| placed somewhere be it an update to a db table or some
| other downstream thing.
| systems wrote:
| The key feature of SSIS is parallel dataflow.
|
| You can very easily write (and schedule) parallel dataflows
| in SSIS; doing the same in a general purpose programming
| language would be a lot harder.
|
| Also remember that dataflows are data pipe streams, so
| SSIS can be very, very fast.
|
| Anyway, there is BIML, which allows you to create SSIS
| packages by writing XML. I personally never used it,
| mainly because its licensing situation seemed weird to me
| (I think BIML is free, but the tool support is not, and
| MS SSDT doesn't support BIML coding, I think).
| gigatexal wrote:
| Yeah I think I never gave it a fair shake. I think --
| like most things -- if understood and used properly it
| can be amazing.
| bradford wrote:
| > What do you love so about PowerBI?
|
| For a large portion of my career, the dashboarding
| solutions I've worked with have followed a similar
| model: they provide a presentation layer directly on top
| of a query of some kind (usually, but not always, a SQL
| query). This seems like a natural next step for
| organizations that have a lot of data in one spot, but no
| obvious way of visualizing it.
|
| But, after implementing and maintaining
| dashboards/reports constructed in this way, big problems
| start to arise. The core of the problem is that each
| dashboard/report/visual is tightly coupled to the
| datasource that's backing it. This tight coupling leads
| to many problems which I won't enumerate.
|
| Power BI is great because it can provide an abstraction
| layer (a semantic model) on top of the many various data
| sources that might be pushed into a report. You're free
| to combine data from Excel with MSSQL or random JSON, and
| it all plays together nicely in the same report. You can
| do data cleaning in the import stage, and the
| dimension/fact-tables pattern has been able to solve the
| wide variety of challenges that I've thrown at it.
|
| All this means that the PowerBI reports I've made have
| been far more adaptable to changes than the other tightly
| coupled solutions I've used. (I haven't used Tableau, but
| my understanding is that it provides similar modeling
| concepts to decouple the data input from the data origin.
| I'm not at all familiar with Looker).
|
| [disclaimer, I'm a Microsoft Employee (but I don't work
| in Power BI)]
| greggyb wrote:
| Power BI embeds a quite good columnstore database engine.
| It is exactly the same database engine as in the SQL
| Server Analysis Services Tabular model.
|
| For the types of queries typical in BI (primarily
| aggregations that involve lots of table scans), this
| yields great performance. Non-technically savvy users can
| import millions of rows of data and explore this with
| interactive visuals. The performance characteristics of
| the database engine allow moderately complex logic to
| remain high-performance into 10s or 100s of millions of
| records. This is on laptop-spec hardware. 100Ms of
| records will need enough supporting RAM, but the data is
| highly compressed, so this fits.
|
| The performance scaling allows simple aggregates of
| single-digit-billions-of-records to remain fast: usually
| just a couple of seconds. Performance tuning for more
| complex logic or larger datasets becomes complex.
|
| When I talk about the performance characteristics above,
| I mean for the in-memory database engine. This
| performance comes out of the box. There are no indices to
| create or typical database administration. Users set up
| an import of data and define relationships among tables,
| and define aggregation logic. The query language is DAX,
| a functional, relational language built with the goal of
| making business logic and aggregations simple to write.
|
| For simple use cases, users do not even need to write
| their logic in DAX. The viz layer automatically generates
| basic aggregates and provides a sophisticated filtering
| and interactive viz canvas experience.
|
| There is also an ETL layer, called Power Query. This has
| very different performance characteristics, and in
| general you should try to minimize your work in this tool
| and its M language, as it has very different semantics.
| The sweet spot is intra-row logic and basic filtering of
| data that will be imported into the columnstore database
| I mentioned above.
|
| The same underpinning technology, VertiPaq, supports
| columnar compression and storage/low level query in other
| Microsoft products.
|
| Columnstore indices in SQL Server also use VertiPaq for
| their compression and storage engine.
|
| The data storage layer in Microsoft Fabric utilizes
| parquet files in delta tables (same tech as Databricks).
| Part of the VertiPaq engine is a set of compression
| optimization heuristics that are quite good. They apply
| this optimization to get 30%-50% compression improvements
| in the parquet files they generate. They call this
| feature V-order (v for VertiPaq).
|
| The standard compression algorithms in Parquet include
| dictionary and run-length encoding. The same are used
| (though obviously with different implementations) in
| VertiPaq's native storage format. The compression
| optimization heuristics include some for identifying
| superior sort orders to help maximize the effect of these
| styles of compression. V-order is the application of
| these heuristics to the data before applying compression
| for the parquet files.
|
| To be 100% clear, V-order is a pre-processing step that
| yields vanilla parquet files, not any sort of extension
| to the parquet format. V-order is applied automatically
| in the storage layer for Microsoft Fabric, called
| OneLake.
|
| You may come across documentation for xVelocity; this is
| an abandoned name for the same technology, VertiPaq.
|
| I can talk about this literally all day. If you've got
| specific followups, feel free to ask here or email me at
| (any name) at (my username) dot com.
| greggyb wrote:
| Follow up regarding the platform question:
|
| Power BI is a desktop application for Windows and also a
| SaaS solution for hosting the solutions you build in the
| desktop tool.
|
| The SaaS product offering is confusing for the time being
| due to the introduction of Fabric, which includes Power
| BI and a whole data stack besides.
|
| Your Power BI power users will need a Windows platform
| available, though I have seen plenty of use of Parallels
| or a remote desktop infrastructure for users not using a
| Windows device.
|
| There are authoring experiences in-browser that are
| improving and in some cases surpassing what is in Power
| BI Desktop, but if you're doing a lot of work authoring
| in Power BI, then the desktop application is where you
| want to be for now.
|
| For pure querying or _just_ data model development, there
| are third party tools that are superior to Power BI
| Desktop (disclosure: I do work part time with one of
| these third party vendors).
| sroussey wrote:
| At least both of these databases support clustered indexes.
| For decades.
| pphysch wrote:
| > And finally the GEM that is database projects, you can
| program your DB changes declaratively, there is nothing
| like this in the market and again, no one is even trying
|
| Automatic migration tools have essentially been doing this
| for a while (e.g. Django migrations). But I agree it would
| be nice if Postgres had better support built in.
| koolba wrote:
| Database migrations to arrive at a target state are an
| interesting beast. It's not just the destination that
| matters, it's how you get there.
|
| The naive approach of applying whatever DDL you need to
| turn database schema A into B is not necessarily the one
| anybody would actually want to run, and in larger
| deployments it's most definitely not the one you'd want
| to run.
|
| Intermediate availability and data migration go well
| beyond simply adding tables or columns. They're highly
| application specific.
|
| The main improvements in this space are making existing
| operations CONCURRENT (e.g., index creation) and
| minimizing overall lock times for operations that can't
| be (e.g., adding columns with default values).
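|
| e.g. the difference in spirit (index name illustrative):
|
|     -- naive diff output: blocks writes while it builds
|     CREATE INDEX orders_created_idx ON orders (created_at);
|
|     -- what you actually want on a live table
|     CREATE INDEX CONCURRENTLY orders_created_idx
|         ON orders (created_at);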
| pphysch wrote:
| I agree automating migrations correctly from state A to B
| without _any_ risk of data loss or downtime is
| essentially an intractable problem.
|
| Still, the popular tools seem to have good heuristics and
| good DBAs recognize when migrations must be written by
| hand.
| Guillaume86 wrote:
| I fully agree on the sql project type, it's amazing.
| Another great side benefit is intellisense on your db
| schema (find all references, etc); git blame on the schema
| is also great. Doesn't Redgate offer kind of the same tool
| commercially, with support for more than mssql? Never
| tried it but IIRC it looked similar...
| 0cf8612b2e1e wrote:
| A boolean column type.
| marcosdumay wrote:
| Microsoft is way less likely to sue you and a couple of
| orders of magnitude cheaper than Oracle.
|
| Besides, managing Microsoft licensing is bliss compared to
| Oracle's. And yeah, MSSQL is much better than Oracle in
| almost every way.
|
| If you only compare those two, it's a no-brainer.
| jmull wrote:
| Well, you can't _necessarily_ trust open source for business
| software.
|
| The more deeply your business depends on something, the more
| careful you need to be when selecting the source for that
| something. (And the db is often very deeply depended on.)
|
| You want to see that their long-term incentives align with
| your own needs.
|
| But a revenue stream is just one way to do this, and not a
| perfect one. (Case in point: Oracle.)
|
| In my experience, SQL Server isn't bad though. I know a couple
| commercial products that started with SQL Server in the late
| '90s and remain happy with it now. The cost hasn't been a
| problem and they like the support and evolution of the product.
| They find they can adopt newer versions and features when they
| need to without too much pain/cost/time.
|
| (Not that I don't think Postgres is generally a better green-
| field pick, though, and even more so porting away from Oracle.)
| bityard wrote:
| I ran into a lot of that 20 years ago, surprised to hear it's
| still a thing at all given how it's basically common knowledge
| that most of the Internet and Cloud run on open source
| software.
|
| I once met an older gentleman who was doing IT work for a
| defense contractor. He seemed nice enough. We were making small
| talk and I happened to mention that I had recently installed
| Linux on my computer at home. He tone changed almost
| immediately and he started ranting about how Linux was pirated
| source code, stolen from Microsoft, all of it contains viruses,
| etc. He was talking about the SCO vs Linux lawsuits but of
| course got absolutely ALL of the details wrong, like which
| companies were even involved in the lawsuits. He was so far off
| the deep end that I didn't even try to correct him, I just
| nodded and smiled and said I was actually running late to be
| somewhere else...
| throw351203910 wrote:
| What your boss doesn't realize is your business already depends
| on FOSS. Here are a few examples:
|
| - Do you use any cloud provider? Those platforms are built on
| top of open source software: Linux, nginx (e.g. Cloudflare's
| edge servers before the rust rewrite), ha-proxy (AWS ELB),
| etc.
|
| - The software your business builds or depends on probably
| uses open source libraries (e.g. libc).
|
| - The programming languages your business uses directly or
| indirectly are probably open source.
|
| My point is that folks who make these kinds of statements
| have no clue how their software is built or what kind of
| software their business actually depends on.
| baq wrote:
| mssql is a _great_ rdbms. t-sql is... different... in certain
| places but all in all if cost isn't a big issue you really
| can't go wrong by picking it.
| pestaa wrote:
| Very impressive changelog.
|
| Bit sad the UUIDv7 PR didn't make the cut just yet:
|
| https://commitfest.postgresql.org/49/4388/
| ellisv wrote:
| I've been waiting for "incremental view maintenance" (i.e.
| incremental updates for materialized views) but it looks like
| it's still a few years out.
| whitepoplar wrote:
| There's always the pg_ivm extension you can use in the
| meantime: https://github.com/sraoss/pg_ivm
| ellisv wrote:
| Unfortunately we use a cloud provider to host our
| databases, so I can only install limited extensions.
| gregwebs wrote:
| REFRESH CONCURRENTLY is already an incremental update of
| sorts, although you still pay the price of running the full
| query.
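|
| Sketch (the matview needs a unique index for the
| concurrent path; names illustrative):
|
|     CREATE UNIQUE INDEX ON my_matview (id);
|     REFRESH MATERIALIZED VIEW CONCURRENTLY my_matview;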
| isoprophlex wrote:
| Wow, brilliant! I never knew this existed. Going to try
| this out tomorrow, first thing!
| JamesSwift wrote:
| I'm a huge fan of views as the v1 solution for problems
| before we need to optimize our approach, and this is the
| main blocker in those discussions. If only v2 of the
| approach could be an IVM view, we could leverage views
| much more widely.
| 0cf8612b2e1e wrote:
| I must be missing something, because that feels easy to
| implement: a timestamp plus random data, in the same way
| as UUIDv4.
|
| Where is the iceberg of complexity?
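|
| Something like the user-space snippet that's been floating
| around (48-bit ms timestamp overlaid on random bytes, then
| the version bits set; a sketch, not battle-tested):
|
|     CREATE FUNCTION uuid_generate_v7() RETURNS uuid AS $$
|     SELECT encode(
|         set_bit(
|           set_bit(
|             overlay(uuid_send(gen_random_uuid())
|               placing substring(int8send((extract(epoch FROM
|                 clock_timestamp()) * 1000)::bigint) FROM 3)
|               FROM 1 FOR 6),
|           52, 1),
|         53, 1),
|       'hex')::uuid
|     $$ LANGUAGE sql VOLATILE;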
| lfittl wrote:
| In my understanding it was a timing issue with the UUIDv7 RFC
| not being finalized before the Postgres 17 feature freeze in
| early April. Shouldn't be an issue to get this in for
| Postgres 18, I think.
| 0cf8612b2e1e wrote:
| That would do it. I was mistakenly thinking v7 was done for
| months/years at this point.
| miohtama wrote:
| The titan keeps rocking.
| yen223 wrote:
| MERGE support for updating views is huge. So looking forward to
| this
| majkinetor wrote:
| Now it also supports the RETURNING clause!
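|
| e.g. (17 also adds merge_action() to see what happened to
| each row; table names illustrative):
|
|     MERGE INTO items i
|     USING incoming s ON i.sku = s.sku
|     WHEN MATCHED THEN UPDATE SET qty = s.qty
|     WHEN NOT MATCHED THEN INSERT (sku, qty)
|         VALUES (s.sku, s.qty)
|     RETURNING merge_action(), i.*;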
| jackschultz wrote:
| Very cool with the JSON_TABLE. The style of putting a json
| response (from an API, created from scraping, etc.) into a
| jsonb column and then writing a view on top to parse /
| flatten is something I've been doing for a few years now.
| I've found it really great to put the json into a table,
| somewhere safe, and then do the parsing there rather than
| dealing with possible errors on the scripting language side.
| I haven't seen this style used in other places before, and
| seeing it in the docs as a feature of new postgres makes me
| feel a bit more sane. Will be cool to try this out and see
| the differences from what I was doing before!
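|
| With 17 the view layer of that pattern can be a single
| JSON_TABLE (the schema here is a made-up example):
|
|     CREATE TABLE raw_responses (id bigserial, body jsonb);
|
|     CREATE VIEW responses_flat AS
|     SELECT r.id, jt.*
|     FROM raw_responses r,
|         JSON_TABLE(r.body, '$.items[*]' COLUMNS (
|             sku text PATH '$.sku',
|             qty int PATH '$.qty')) AS jt;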
| ellisv wrote:
| It is definitely an improvement on multiple
| `JSONB_TO_RECORDSET` and `JSONB_TO_RECORD` calls for flattening
| nested json.
| abyesilyurt wrote:
| > putting a json response (from an API, created from
| scraping, etc.) into a jsonb column and then writing a view
| on top to parse
|
| That's a very good idea!
| 0cf8612b2e1e wrote:
| A personal rule of mine is to always separate data
| receipt+storage from parsing. The retrieval is comparatively
| very expensive and has few possible failure modes. Parsing can
| always fail in new and exciting ways.
|
| Disk space to store the returned data is cheap and can be
| periodically flushed only when you are certain the content has
| been properly extracted.
| erichocean wrote:
| I ended up with the same design after encountering numerous
| exotic failure modes.
| ksec wrote:
| With 17, is Vacuum largely a solved issue?
| ses1984 wrote:
| I'm not up to date on the recent changes, but problems we had
| with vacuum were more computation and iops related than memory
| related.
|
| Basically in a database with a lot of creation/deletion,
| database activity can outrun the vacuum, leading to out of
| storage errors, lock contention, etc
|
| In order to keep throughput up, we had to throttle things
| manually on the input side, to allow vacuum to complete.
| Otherwise throughput would eventually drop to zero.
| sgarland wrote:
| You can also tune various [auto]vacuum settings that can
| dramatically increase the amount of work done in each cycle.
|
| I'm not discounting your experience as anything is possible,
| but I've never had to throttle writes, even on large clusters
| with hundreds of thousands of QPS.
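|
| For reference, the usual knobs look like this (values are
| illustrative starting points, not recommendations):
|
|     ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;
|     ALTER SYSTEM SET autovacuum_naptime = '15s';
|     ALTER TABLE hot_table
|         SET (autovacuum_vacuum_scale_factor = 0.01);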
| jabl wrote:
| There was a big project to re-architect the low level storage
| system to something that isn't dependent on vacuuming, called
| zheap. Unfortunately it seems to have stalled and nobody seems
| to be working on it anymore? I keep scanning the release notes
| for each new pgsql version, but no dice.
| qazxcvbnm wrote:
| I think OrioleDB is picking up where zheap left off, and it
| seems quite active in upstreaming its work; you might be
| interested to check it out.
| imbradn wrote:
| No, vacuum issues are not solved. This will reduce the amount
| of scanning needed in many cases when vacuuming indexes. It
| will mean more efficient, quicker vacuums, which will help in
| a lot of cases.
| ktosobcy wrote:
| Would be awesome if PostgreSQL would finally add support for
| seamless major version upgrade...
| ellisv wrote:
| I'm curious what you feel is specifically missing.
| levkk wrote:
| pg_upgrade is a bit manual at the moment. If the database
| could just be pointed at a data directory and upgrade it
| automatically on startup, that would be great.
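|
| Today it's roughly this, by hand (paths illustrative):
|
|     pg_upgrade \
|       --old-bindir /usr/lib/postgresql/16/bin \
|       --new-bindir /usr/lib/postgresql/17/bin \
|       --old-datadir /var/lib/postgresql/16/main \
|       --new-datadir /var/lib/postgresql/17/main \
|       --link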
| olavgg wrote:
| I agree, why is this still needed? It can run pg_upgrade in
| the background.
| vbezhenar wrote:
| It needs binaries for both old and new versions for some
| reason.
| dewey wrote:
| Maybe https://github.com/pgautoupgrade/docker-pgautoupgrade is
| interesting for you.
| ticoombs wrote:
| I use this for my Lemmy instance & lemmy-ansible and it's
| been great! No longer having to support upgrade scripts and
| write a complete upgrade process[1] for people to follow has
| made my life a lot easier! Amazing product
|
| - [1] https://github.com/LemmyNet/lemmy-
| ansible/blob/main/UPGRADIN...
| h1fra wrote:
| Amazing, it's been a long time since I have been this
| excited by a software release!
| monkaiju wrote:
| PG just continues to impress!
| clarkbw wrote:
| Some awesome quality-of-life improvements here as well. The
| random function now takes min and max parameters:
|
| SELECT random(1, 10) AS random_number;
| clarkbw wrote:
| It's never available on homebrew the same day so we all worked
| hard to make it available the same day on Neon. If you want to
| try out the JSON_TABLE and MERGE RETURNING features you can
| spin up a free instance quickly.
|
| https://neon.tech/blog/postgres-17
|
| (note that not all extensions are available yet, that takes
| some time still)
| dewey wrote:
| How is your hosted managed Postgres platform an alternative
| to Homebrew (Installing PG on your local Mac)? There's also
| many Homebrew formulas that have PG 17 already like
| (https://github.com/petere/homebrew-
| postgresql/commit/2faf438...).
| clarkbw wrote:
| oh, i see I'm getting downvoted; this isn't a sales pitch
| for my "managed postgres platform".
|
| every year a new postgres release occurs, and when i want
| to try out some of the features i have to find a way to
| get it. usually nobody has it available.
|
| here's the list of homebrew options I see right now:
|
| brew formulae | grep postgresql@
| postgresql@10
| postgresql@11
| postgresql@12
| postgresql@13
| postgresql@14
| postgresql@15
| postgresql@16
|
| maybe you're seeing otherwise but i updated a min ago and
| 17 isn't there yet. even searching for 'postgres' on
| homebrew doesn't reveal any options for 17. i don't know
| where you've found those but it doesn't seem easily
| available.
|
| and i'm not suggesting you use a cloud service as an
| alternative to homebrew or local development. neon is pure
| postgres, the local service is the same as what's in the
| cloud. but right now there isn't an easy local version and
| i wanted everyone else to be able to try it quickly.
| paws wrote:
| I'm looking forward to trying out incremental backups as well as
| JSON_TABLE.
|
| Thank you contributors!
| andreashansen wrote:
| Oh how I wish for Postgres to introduce system-versioned (bi-
| temporal) tables.
| qianli_cs wrote:
| What's your use case for system-versioned tables? You could use
| some extensions like Periods that support bi-temporal tables:
| https://wiki.postgresql.org/wiki/Temporal_Extensions
|
| Or you could use triggers to build one:
| https://hypirion.com/musings/implementing-system-versioned-t...
| baq wrote:
| any system of record has a requirement to be bitemporal; it
| just isn't discovered until too late, IME. I don't know of
| any system anywhere that consciously decided not to be
| bitemporal during initial design.
| jeltz wrote:
| It will hopefully be in PostgreSQL 18.
| lpapez wrote:
| Amazing release, Postgres is a gift that keeps on giving.
|
| I hope to one day see Incremental View Maintenance extension
| (IVM) be turned into a first class feature, it's the only thing I
| need regularly which isn't immediately there!
| nikita wrote:
| A number of features stood out to me in this release:
|
| 1. Chipping away more at vacuum. Fundamentally Postgres
| doesn't have an undo log and therefore has to vacuum. It's a
| trade-off of fast recovery vs, well... having to vacuum. The
| unfortunate part about vacuum is that it adds load to the
| system exactly when the system needs all the resources. I
| hope one day people will stop needing to know that vacuum
| exists; we are one step closer, but not there yet.
|
| 2. Performance gets better and not worse. Mark Callaghan
| blogs about MySQL and Postgres performance changes over
| time; MySQL keeps regressing performance while Postgres
| keeps improving.
|
| https://x.com/MarkCallaghanDB https://smalldatum.blogspot.com/
|
| 3. JSON. Postgres keeps improving QOL for interop with JS
| and TS.
|
| 4. Logical replication is becoming a super robust way of
| moving data in and out. This is very useful when you move
| data from one instance to another, especially if version
| numbers don't match. Recently we have been using it to move
| data at 1 Gb/s (a minimal sketch follows at the end of this
| comment).
|
| 5. Optimizer. The better the optimizer, the less you think
| about the optimizer. According to the research community,
| SQL Server has the best optimizer. It's very encouraging
| that the PG optimizer gets better with every release.
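|
| The sketch for (4), with illustrative names:
|
|     -- on the source
|     CREATE PUBLICATION migrate_pub FOR ALL TABLES;
|
|     -- on the target (can be a different major version)
|     CREATE SUBSCRIPTION migrate_sub
|         CONNECTION 'host=old-primary dbname=app user=repl'
|         PUBLICATION migrate_pub;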
| netcraft wrote:
| I remember when the json stuff started coming out, I thought it
| was interesting but nothing I would ever want to rely on - boy
| was I wrong!
|
| It is so nice having json functionality in a relational db - even
| if you never actually store json in your database, it's useful in
| so many situations.
|
| Being able to generate json in a query from your data is a big
| deal too.
|
| Looking forward to really learning json_table
| openrisk wrote:
| There was LAMP and then MERN and MEAN etc. and then there was
| Postgres.
|
| It's not quite visible yet, but all this progres by postgres
| (excuse the pun) on making JSON more deeply integrated with
| relational principles will surely at some point enable a new
| paradigm, at least for full-stack web frameworks?
| winrid wrote:
| The problem with postgres's JSON support is that it leads to
| last-write-wins race conditions, compared to actual document
| stores. Once it can support operations like incrementing a
| number while someone else updates another field, without
| locking the row, then maybe.
|
| I don't think you'll see a paradigm shift. You'll just see
| people using other document stores less.
| sgarland wrote:
| We're already in a new paradigm, one in which web devs are
| abandoning referential integrity guarantees in favor of not
| having to think about data modeling.
|
| I'd say they're then surprised when their DB calls are slow
| (nothing to do with referential integrity, just the
| TOAST/DETOAST overhead), but since they also haven't
| experienced how fast a DB with local disks and good data
| locality can be, they have no idea.
| switch007 wrote:
| Props to all the people (and companies) behind Postgres
| https://www.postgresql.org/community/contributors/
| Rican7 wrote:
| Wow, yea, the performance gains and new UX features (JSON_TABLE,
| MERGE improvements, etc) are huge here, but these really stand
| out to me:
|
| > PostgreSQL 17 supports using identity columns and exclusion
| constraints on partitioned tables.
|
| > PostgreSQL 17 also includes a built-in, platform independent,
| immutable collation provider that's guaranteed to be immutable
| and provides similar sorting semantics to the C collation except
| with UTF-8 encoding rather than SQL_ASCII. Using this new
| collation provider guarantees that your text-based queries will
| return the same sorted results regardless of where you run
| PostgreSQL.
| __natty__ wrote:
| This release is good on so many levels. Just the performance
| optimisations and the JSON_TABLE feature could have been a
| whole release on their own, but we got so much more.
| nojvek wrote:
| I wish postgres supported parquet file imports and exports.
| The COPY command with CSV is really slooooooooow. Even BINARY
| is quite slow and bandwidth-heavy.
|
| I wonder how open postgres is and what kinds of pull requests
| the postgres team considers. I'd like to learn how to
| contribute to PG in baby steps and eventually get to a place
| where I could contribute substantial features.
| ioltas wrote:
| There has been a patch to extend the COPY code with pluggable
| APIs, adding callbacks at start, end, and for each row
| processed: https://commitfest.postgresql.org/49/4681/.
|
| I'd guess that this may fit your purpose of adding a custom
| format without having to fork upstream.
___________________________________________________________________
(page generated 2024-09-26 23:01 UTC)