# A Look at Postgres 15: MERGE Command with Examples

Jean-Paul Argudo · Oct 14, 2022 · 4 min read

With PostgreSQL 15 comes a new SQL command called MERGE. MERGE has been in the SQL standard for quite a while, but it has only now made its way into the PostgreSQL codebase. Prior to Postgres 15, if you wanted to refresh a target table from a source table, you could use the "upsert" method with the ON CONFLICT clause. Now, MERGE can be used instead!

Some situations where MERGE makes a lot of sense are:

* loading data from external sources, through foreign data wrappers
* staged and batched processing jobs

## About MERGE

Let's look at the synopsis in the documentation:

```
[ WITH with_query [, ...] ]
MERGE INTO target_table_name [ [ AS ] target_alias ]
USING data_source ON join_condition
when_clause [...]

where data_source is:

{ source_table_name | ( source_query ) } [ [ AS ] source_alias ]

and when_clause is:

{ WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
  WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } }

and merge_insert is:

INSERT [( column_name [, ...] )]
[ OVERRIDING { SYSTEM | USER } VALUE ]
{ VALUES ( { expression | DEFAULT } [, ...] ) | DEFAULT VALUES }

and merge_update is:

UPDATE SET { column_name = { expression | DEFAULT } |
             ( column_name [, ...] ) = ( { expression | DEFAULT } [, ...] ) } [, ...]

and merge_delete is:

DELETE
```

We can read that data_source can be either a table, data brought into the command by a prior WITH with_query, or a source_query given inline after USING.

When the data matches, i.e. there's a matching record in the target_table_name, we can specify what to do:

* an UPDATE, as defined in merge_update (see UPDATE SET ...), or
* a DELETE, as defined in merge_delete, or
* DO NOTHING

When the data doesn't match, i.e. there's no matching record in the target_table_name, we either do an INSERT (see merge_insert) or, again, DO NOTHING.
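Before we get to a full example, here is a minimal sketch of how those branches combine. The stock and stock_feed tables and their columns are made up purely for illustration; they are not part of the example that follows:

```sql
-- Illustration only (hypothetical tables): sync a stock table from a feed,
-- removing items the feed has flagged as discontinued.
merge into stock s
using stock_feed f
on s.item_id = f.item_id
when matched and f.discontinued then
    delete                                            -- merge_delete
when matched then
    update set qty = f.qty                            -- merge_update
when not matched and f.discontinued then
    do nothing                                        -- nothing to remove
when not matched then
    insert (item_id, qty) values (f.item_id, f.qty);  -- merge_insert
```

The WHEN clauses are evaluated in order and the first one that matches wins, so the more specific `AND condition` branches must come before the catch-all ones.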
## MERGE example - remote sensors & batches

In working with clients, I have seen a recurring need for loading data from remote sensors or stations. Anything that "sends data regularly", where that data then has to be loaded into a dataset, is a common need. I'm particularly excited about using MERGE to solve these issues for Postgres users.

To explain this data loading use case of MERGE, I'm going to set up an example: a database of stations, each a remote measurement tool with data coming in intermittently.

* Streaming data, arriving continuously
* Stations emit periodic and intermittent measures
* A batch job collects this data
* station_data_new is a temporary table used as common staging storage
* Data is stored long term in station_data_actual with the latest known values
* We want to keep track of when each station was created

Let's create some tables for testing purposes. Beware that the temporary table station_data_new will only exist in the context of a given session.

```sql
create temporary table station_data_new (
    station_id integer
  , a integer
  , b integer
);

create table station_data_actual (
    station_id integer primary key
  , a integer
  , b integer
  , created timestamp default current_timestamp
  , updated timestamp default current_timestamp
);
```

Let's insert some sample data into station_data_new: our first 5 stations are up and have sent their first batch of data:

```sql
with measures as (
    select * from generate_series(1,5)
)
insert into station_data_new (station_id, a, b)
select
    generate_series
  , round(random()*100+1)
  , round(random()*100+1)
from measures;
```

## Basic MERGE

At this point, we could just do a plain INSERT with a SELECT from station_data_new into station_data_actual. Instead we'll use MERGE, because we're planning for the case where the data already exists: then an UPDATE will be issued rather than an INSERT.

```sql
merge into station_data_actual sda
using station_data_new sdn
on sda.station_id = sdn.station_id
when matched then
    update set a = sdn.a, b = sdn.b, updated = default
when not matched then
    insert (station_id, a, b)
    values (sdn.station_id, sdn.a, sdn.b);
```

If you execute it once, you'll get MERGE 5 in return, and the data is inserted into station_data_actual, with the timestamps in created and updated holding the same value.

To do even more testing you could (the statements are collected after this list):

* truncate table station_data_new;
* re-create data in it, changing generate_series(1,5) to generate_series(1,10)
* issue again the very same MERGE we ran before

You'll see the data in station_data_actual updated. Stations 1 to 5 will have a fresh updated timestamp and stations 6 to 10 will be newly created.
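Put together, the test run looks like this; these are the same statements as above, with only the series bounds changed:

```sql
-- Empty the staging table, then load a bigger batch: stations 1 to 10.
truncate table station_data_new;

with measures as (
    select * from generate_series(1,10)
)
insert into station_data_new (station_id, a, b)
select
    generate_series
  , round(random()*100+1)
  , round(random()*100+1)
from measures;

-- The very same MERGE: now reports MERGE 10
-- (5 existing rows updated, 5 new rows inserted).
merge into station_data_actual sda
using station_data_new sdn
on sda.station_id = sdn.station_id
when matched then
    update set a = sdn.a, b = sdn.b, updated = default
when not matched then
    insert (station_id, a, b)
    values (sdn.station_id, sdn.a, sdn.b);
```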
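For comparison, here is roughly what the pre-15 "upsert" mentioned at the start of this post would look like for the same insert-or-update logic; note that ON CONFLICT relies on the primary key on station_id, while MERGE joins on an arbitrary condition:

```sql
-- Pre-Postgres-15 upsert, roughly equivalent to the basic MERGE above.
insert into station_data_actual (station_id, a, b)
select station_id, a, b
from station_data_new
on conflict (station_id)
do update set
    a = excluded.a
  , b = excluded.b
  , updated = default;
```

One difference worth keeping in mind: ON CONFLICT handles concurrent inserts gracefully, whereas a concurrent MERGE can still fail with a unique violation, so the upsert form remains the safer choice under heavy concurrent write load.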
## Conclusion

MERGE was a long-awaited feature for PostgreSQL fans, and now we have it! I encourage you to review your processes wherever they merge incoming data with data already in your database: as in this example, that can be a good place to use it.

MERGE also opens new ways of aggregating and/or merging data from many databases. In distributed models, where data lives across different locations, using foreign data wrappers with MERGE could be a really elegant solution. MERGE may simplify your processing of data all along the database's life, with strong and fast SQL statements instead of functions or other complex operations.
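To close, here is a rough sketch of that foreign data wrapper idea, merging our station data straight from a remote database. Everything about the remote side (server address, database, credentials, table name) is hypothetical; only postgres_fdw itself and the MERGE are standard:

```sql
-- Hypothetical remote setup: the staging data lives on another Postgres
-- instance and is pulled in through postgres_fdw.
create extension if not exists postgres_fdw;

create server stations_remote
    foreign data wrapper postgres_fdw
    options (host 'stations.example.com', dbname 'stations');

create user mapping for current_user
    server stations_remote
    options (user 'reader', password 'secret');

create foreign table station_data_remote (
    station_id integer
  , a integer
  , b integer
) server stations_remote
  options (schema_name 'public', table_name 'station_data_new');

-- A foreign table is just another data_source for MERGE.
merge into station_data_actual sda
using station_data_remote sdr
on sda.station_id = sdr.station_id
when matched then
    update set a = sdr.a, b = sdr.b, updated = default
when not matched then
    insert (station_id, a, b)
    values (sdr.station_id, sdr.a, sdr.b);
```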