Citus 12: Schema-based sharding for PostgreSQL

Written by Marco Slot
July 18, 2023

What if you could automatically shard your PostgreSQL database across any number of servers and get industry-leading performance at scale without any special data modelling steps?

Our latest Citus open source release, Citus 12, adds a new and easy way to transparently scale your Postgres database: schema-based sharding, where the database is transparently sharded by schema name.

Schema-based sharding gives you an easy path for scaling out several important classes of applications that can divide their data across schemas:

* Multi-tenant SaaS applications
* Microservices that use the same database
* Vertical partitioning by groups of tables

Each of these scenarios can now be enabled on Citus using regular CREATE SCHEMA commands. That way, many existing applications and libraries (e.g. django-tenants) can scale out without any changes, and developing new applications can be much easier. Moreover, you keep all the other benefits of Citus, including distributed transactions, reference tables, rebalancing, and more.

In this blog post, you'll get a high-level overview of schema-based sharding and other new Citus 12 features:

* What is schema-based sharding?
* How to use Citus schema-based sharding for Postgres
* Benefits of schema-based sharding
* Choosing a sharding model for multi-tenant applications (schema-based vs. row-based)
* Migrating an existing schema-per-tenant application to Citus
* MERGE improvements

Even more details are available in the release notes on the 12.0 Updates page. And if you want to see demos of some of this functionality, be sure to join the livestream of the Citus 12.0 release party on Wed 02 Aug (mark your calendar and join us). Let's dive in!

What is schema-based sharding?

Schema-based sharding means that tables from the same schema are placed on the same node, while different schemas may be on different nodes. That way, queries and transactions that involve a single schema can always be evaluated efficiently by a single node (read: without network overhead), while the system can transparently scale out to accommodate an arbitrarily large amount of data and a high rate of queries across different schemas. Moreover, the cluster can be rebalanced based on disk usage, such that large schemas automatically get more resources dedicated to them, while small schemas are efficiently packed together.

So far, Citus has primarily focused on "row-based sharding", where tables are hash-distributed across nodes based on the value in the distribution column. Row-based sharding enables arbitrarily large PostgreSQL tables, with parallel distributed queries, query routing based on filters, and distributed DML and DDL. Row-based sharding is very suitable for analytical applications (e.g. IoT, time series) and other scenarios in which you need very large tables. It also works well for multi-tenant applications that keep their data in a set of shared tables, but to distribute your database effectively you need to add a "tenant ID" column to all your tables, and include that column in all filters, inserts, foreign keys, etc. You also need to explicitly run create_distributed_table to select a distribution column for each table. If you skip one of these steps, performance might be poor due to network overhead, or you might run into distributed SQL limitations.
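For contrast, here is a minimal sketch of those extra row-based steps; the orders table and its columns are illustrative, not from this release:

-- row-based sharding: every table carries a tenant ID column,
-- which must be part of the primary key (and of foreign keys)
create table orders (
    tenant_id bigint not null,
    order_id bigserial,
    total numeric,
    primary key (tenant_id, order_id));

-- explicitly pick the distribution column for each table
select create_distributed_table('orders', 'tenant_id');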
Schema-based sharding has almost no data modelling restrictions or special steps compared to unsharded PostgreSQL. That makes it easy to build a scalable multi-tenant application with a schema per tenant, or other applications where different parts of your data model can live in different schemas. The main restriction of schema-based sharding is that joins and foreign keys should only involve tables from the same schema (or reference tables), but transactions across schemas still work fine. That makes schema-based sharding much easier to use than row-based sharding, though it only makes sense for specific workload patterns. For applications that can use either model (e.g. multi-tenant apps), the downsides of schema-based sharding are the need to manage many tables and some performance overhead. A more detailed comparison is given below.

How to use Citus schema-based sharding for Postgres

You can easily get started with schema-based sharding by enabling the citus.enable_schema_based_sharding setting. When enabled, any schema you create will become a "distributed schema".

-- Enable schema-based sharding!
set citus.enable_schema_based_sharding to on;

-- Create distributed schemas for two of my tenants
create schema tenant1;
create schema tenant2;

-- See my distributed schemas
select * from citus_schemas;

 schema_name | colocation_id | schema_size | schema_owner
-------------+---------------+-------------+--------------
 tenant1     |             3 | 0 bytes     | marco
 tenant2     |             4 | 0 bytes     | marco
(2 rows)

Any table created in the tenant1 or tenant2 schema will now automatically become a "single shard" table that is "co-located" with all other tables in the schema. Compared to hash-distributed tables in Citus, single shard tables do not have a distribution column. There is only one shard per table, and all shards in the same schema live on the same node.
-- Create single shard tables in schema tenant1 (automatically co-located)
set search_path to tenant1;

create table note_categories (
    category_id bigserial primary key,
    category_name text not null);

create table notes (
    note_id bigserial primary key,
    category_id bigint references note_categories (category_id),
    message text not null);

-- Create single shard tables in schema tenant2 (automatically co-located)
set search_path to tenant2;

create table note_categories (
    category_id bigserial primary key,
    category_name text not null);

create table notes (
    note_id bigserial primary key,
    category_id bigint references note_categories (category_id),
    message text not null);

You can easily see where the tables live in the citus_shards view:

select table_name, shardid, colocation_id, nodename, nodeport, shard_size
from citus_shards where citus_table_type = 'schema';

       table_name        | shardid | colocation_id | nodename  | nodeport | shard_size
-------------------------+---------+---------------+-----------+----------+------------
 tenant1.note_categories |  102310 |             3 | wrk1.host |     5432 |      16384
 tenant1.notes           |  102311 |             3 | wrk1.host |     5432 |      16384
 tenant2.note_categories |  102312 |             4 | wrk2.host |     5432 |      16384
 tenant2.notes           |  102313 |             4 | wrk2.host |     5432 |      16384
(4 rows)

Now, any query that involves only tables from the same schema will be transparently delegated to the right node. You can either set the search_path or use fully qualified names in your queries:

-- perform a query on different tenants using search_path and relative paths
set search_path to tenant1;
select * from notes join note_categories using (category_id);

set search_path to tenant2;
select * from notes join note_categories using (category_id);

-- perform a query on different tenants using fully-qualified names
select * from tenant1.notes join tenant1.note_categories using (category_id);
select * from tenant2.notes join tenant2.note_categories using (category_id);

You can run these queries from any node. We are also improving PgBouncer to handle search_path correctly when using schema-based sharding.

That's pretty much it. To use schema-based sharding in Citus 12 or later, you create multiple schemas, and use search_path or fully qualified names to switch between schemas when creating tables or running queries. The schemas are transparently spread across nodes (starting on a single node) and can be rebalanced. No additional data modelling steps (like create_distributed_table) are required! In a follow-up blog post, we'll dive deeper into different usage scenarios and schema management.

Benefits of schema-based sharding in Citus and Postgres

Schema-based sharding gives you an easy path for sharding if your application and database model fit into one of these classes:

Multi-tenant applications (schema per tenant): Software as a service (SaaS) applications serve many different tenants from a single deployment, often backed by a single database for ease of management and cost efficiency. Most queries done by these SaaS applications can be scoped to a single tenant, which makes the tenant a natural shard key. You can shard your Citus database by creating a schema per tenant, as an alternative to distributing tables by a tenant ID column (a minimal onboarding sketch follows below).

Figure 1: Schema per tenant application where each tenant has its own schema, typically with the same set of tables
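As a concrete illustration of the schema-per-tenant pattern, here is a minimal sketch of onboarding a new tenant in a single transaction, reusing the notes tables from the example above; the tenant name is hypothetical, and this assumes citus.enable_schema_based_sharding is still on:

begin;

-- becomes a distributed schema, since citus.enable_schema_based_sharding is on
create schema tenant3;
set local search_path to tenant3;

create table note_categories (
    category_id bigserial primary key,
    category_name text not null);

create table notes (
    note_id bigserial primary key,
    category_id bigint references note_categories (category_id),
    message text not null);

commit;

Because schemas can be managed via transactions (see the benefits list below), the new tenant either appears with all of its tables or not at all.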
Microservices (schema per microservice): Many cloud-native applications are broken down into microservices that are developed and deployed separately. Each microservice has its own state, which is usually quite small and simple. Rather than creating and managing a separate database system for each microservice, it is useful to consolidate the state of different microservices into a single distributed database. That way, you have a single place to manage and query all data, can share data and perform transactions across microservices, and can minimize cost by sharing resources, without the database becoming a bottleneck.

Figure 2: Schema per microservice application, with each microservice keeping its state in a separate schema in the same distributed database

Vertical partitioning (schema per table group): The databases of complex OLTP applications often involve many different tables, and not all of them are closely related. "Vertical partitioning" refers to the practice of sharding your database into groups of related tables, with each group living on its own database server. With schema-based sharding, you can easily achieve this, or prepare for it upfront, by assigning each group to its own schema and scaling out only when necessary (avoiding all the growing pains).

Figure 3: Vertical partitioning with groups of related tables combined in a schema

Within a distributed schema, you can use arbitrary joins and foreign keys. Transactions that remain scoped to a schema will be efficiently delegated to the node that stores the schema, which minimizes overhead and gives you PostgreSQL performance characteristics at any scale. Distributed transactions across schemas are still possible, with some additional network round trips.

Schema-based sharding offers many other benefits over manual, application-level sharding:

* Very easy shard management: just CREATE/DROP SCHEMA, and Citus does the rest.
* Automatic rebalancing of schemas across nodes without downtime.
* Share data across schemas using reference tables, with support for local joins and foreign keys (see the sketch below).
* Tenant-level statistics, which have been expanded in Citus 12 to treat schemas as tenants.
* Mixed distributed database models, e.g. multi-tenant apps where small tenants live in a shared table and large tenants have their own distributed schema, or where ultra-large tenants / microservices live in a regular schema with distributed tables while others have a distributed schema.
* Transparent stored procedure call delegation by co-locating the procedure with a table in a distributed schema.
* Can manage all schemas via transactions.

And with all the flexibility that PostgreSQL and Citus already offer, the overall list of benefits can go on for a while.

Of course, everything in distributed databases is ultimately still a trade-off. One thing to consider is that having a very large number of schemas (or rather, tables) can create certain performance issues in PostgreSQL. For instance, each process keeps a separate catalog cache, which can cause high memory consumption when there are many tables. We recommend carefully considering your sharding model if you expect to have over 10k schemas. Queries and transactions that span multiple schemas may also be slower than on a single PostgreSQL node, and may run into certain SQL limitations.
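Returning to the reference-table sharing mentioned in the benefits list above, here is a minimal sketch under the assumptions of the earlier examples; the countries table and country_code column are illustrative, not from the post:

-- a reference table lives in a regular schema and is replicated to all nodes
create table public.countries (
    country_code text primary key,
    country_name text not null);

select create_reference_table('public.countries');

-- a table in a distributed schema can reference it with a foreign key
-- and join with it locally on the node that stores the schema
set search_path to tenant1;
alter table notes add column country_code text references public.countries (country_code);

select message, country_name
from notes join public.countries using (country_code);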
Choosing a sharding model for multi-tenant SaaS applications

Schema-based sharding is only applicable if your application naturally divides into distinct groups of tables. For applications with a single large table (e.g. IoT measurements), row-based sharding is the obvious choice. However, for multi-tenant applications, row-based sharding (by tenant ID) and schema-based sharding (with a schema per tenant) are both very reasonable options, with different characteristics.

Figure 4: Side-by-side comparison of schema-based sharding vs. row-based sharding. In this diagram, the same colors are used on both sides to depict the data of each of the 5 tenants (green for tenant1, blue for tenant2, yellow for tenant3, grey for tenant4, orange for tenant5), so you can visually see how tenant data is distributed in a schema-based sharding model vs. a row-based sharding model.

Each of these models has different pros and cons to consider:

Schema-based sharding: Each tenant has a separate schema with its own set of tables, in the same database. Tables do not have any restrictions with regard to constraints or foreign keys, except that foreign keys should not cross the schema boundary, unless they are foreign keys to reference tables. Schema-based sharding works especially well to scale applications that have many large tenants, but might see performance degradation with a very large number of tenants (>10k).

Row-based sharding: The data from all tenants is in the same set of tables. Each table has a tenant ID column (or equivalent) which acts as the distribution column. Tables are co-located such that the data for the same tenant ID lives on the same node, across different tables. Primary keys, foreign keys, joins, and filters should include the tenant ID column to ensure operations can be evaluated locally. Very large tenants should be isolated to their own shard.

A high-level comparison of these sharding models (in the context of multi-tenant applications) is given below:

                                      Schema-based sharding            Row-based sharding
 Multi-tenancy model                  Separate schema per tenant       Shared tables with tenant ID columns
 Citus version                        12.0+                            All versions
 Additional steps compared            None, only a config change       Use create_distributed_table on each
 to vanilla PostgreSQL                                                 table to distribute & co-locate tables
                                                                       by tenant ID
 Number of tenants                    1-10k                            100-1M+
 Data modelling requirement           No foreign keys across           Need to include a tenant ID column and
                                      distributed schemas              use it as the distribution column in
                                                                       each table, and include it in primary
                                                                       keys and foreign keys
 SQL requirement for                  Use a single distributed         Joins and WHERE clauses should include
 single node queries                  schema per query                 the tenant_id column
 Parallel cross-tenant queries        No                               Yes
 Custom table definitions per tenant  Yes                              No
 Access control                       Schema permissions               Row-level security
 Data sharing across tenants          Yes, using reference tables      Yes, using reference tables
                                      (in a separate schema)
 Tenant isolation                     Every tenant has its own         Can give specific tenant IDs their own
                                      shard group by definition        shard group via isolate_tenant_to_new_shard

Which model to choose depends on your requirements. The biggest benefit of schema-based sharding is ease of use: no additional data modelling steps are required, only a single setting change. If you have a smaller number of large tenants (B2B), and some require a custom table definition or permissions, then schema-based sharding is also a great fit. If you have a very large number of small tenants (B2C) and want to simplify schema management and cross-tenant queries, then row-based sharding is likely to be a better fit. If you need both, then consider that nothing prevents you from using one set of distributed tables for small tenants with row-based sharding, and separate distributed schemas for large tenants, as sketched below.
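To make that mixed model concrete, here is a minimal sketch; the table and schema names are illustrative (note that the pre-existing public schema stays a regular schema even while citus.enable_schema_based_sharding is on, so it can hold the shared row-based tables):

-- small tenants share one distributed table, sharded by tenant ID
create table public.notes (
    tenant_id bigint not null,
    note_id bigserial,
    message text not null,
    primary key (tenant_id, note_id));

select create_distributed_table('public.notes', 'tenant_id');

-- a large tenant gets its own distributed schema instead
set citus.enable_schema_based_sharding to on;
create schema big_tenant;

create table big_tenant.notes (
    note_id bigserial primary key,
    message text not null);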
Migrating an existing schema-per-tenant application to Citus

One more thing... if you already have an existing PostgreSQL application that uses a schema per tenant, then you can easily migrate it into a Citus cluster that uses schema-based sharding using pgcopydb, a powerful open-source database migration tool created by my colleague Dimitri Fontaine. To see it in action, make sure citus.enable_schema_based_sharding = on is in your postgresql.conf and then run:

pgcopydb clone --source "host=source.host" --target "host=target.host" --restart --drop-if-exists --skip-extensions

You can also use the --follow option to replay writes that happen during the migration, to avoid downtime. You should not make schema changes or create/drop schemas during the migration. We do recommend extensively testing your application before switching over, and reporting any issues via the Citus GitHub repo.

Extending the MERGE superpower in Citus 12

Schema-based sharding is super exciting and the biggest enhancement in Citus 12, but we also continue to improve Citus for other scenarios, including row-based multi-tenancy and Internet of Things (IoT) scenarios.

PostgreSQL 15 introduced the MERGE command, which is especially useful when you receive a batch of records and you not only want to insert new records, but also update or delete existing ones. For instance, consider the following IoT scenario where we want to maintain a table of "active alerts" based on incoming measurements:

create table alerts (
    alert_id bigserial primary key,
    metric text not null,
    upper_bound double precision not null);
create index on alerts (metric);
select create_reference_table('alerts');

create table active_alerts (
    alert_id bigint not null,
    device_id bigint not null,
    alert_start_time timestamptz not null,
    max_value double precision,
    primary key (alert_id, device_id));

-- distributed by alert_id for fast lookup of all devices that are exceeding the threshold for a particular alert
select create_distributed_table('active_alerts', 'alert_id');

create table measurements (
    measurement_id uuid not null default uuid_generate_v4(),
    device_id bigint not null,
    payload jsonb not null,
    measurement_time timestamptz default now(),
    primary key (device_id, measurement_id));

-- distributed by device_id for fast grouping by device_id
select create_distributed_table('measurements', 'device_id');

insert into alerts (metric, upper_bound) values ('temperature-9177', 38.0);
insert into measurements (device_id, payload) values (1, '{"value":37.5, "metric":"temperature-9177"}');
insert into measurements (device_id, payload) values (1, '{"value":38.5, "metric":"temperature-9177"}');
insert into measurements (device_id, payload) values (1, '{"value":38.6, "metric":"temperature-9177"}');
insert into measurements (device_id, payload) values (1, '{"value":38.7, "metric":"temperature-9177"}');

We want to periodically register all devices for which the last 3 measurements exceeded a particular threshold. When the last 3 measurements were all below the threshold, we want to delete the alert.
We can do that using a MERGE command:

/* run every minute using pg_cron */
select cron.schedule('periodic-merge', '* * * * *', $$
  /* let's have some fun with (distributed) SQL */
  merge into active_alerts aa
  using (
    /* select minimum and maximum value across last 3 measurements, per device */
    select alert_id, device_id, min(value) as min_value, max(value) as max_value,
           upper_bound, max(measurement_time) as measurement_time
    from (
      /* get time-ordered measurements by device (latest is row_number 1) */
      select device_id,
             payload->>'metric' as metric,
             (payload->>'value')::double precision as value,
             measurement_time,
             row_number() over (partition by device_id order by measurement_time desc) as row_number
      from measurements
    ) meas
    join alerts on (meas.metric = alerts.metric)
    where row_number <= 3
    group by alert_id, device_id
  ) m
  on aa.alert_id = m.alert_id and aa.device_id = m.device_id

  /* if the threshold is broken and the alert is not yet active, insert it */
  when not matched and m.min_value > m.upper_bound then
    insert values (m.alert_id, m.device_id, m.measurement_time, m.max_value)

  /* if the threshold is broken and the alert is already active, update it */
  when matched and m.max_value > aa.max_value then
    update set max_value = m.max_value

  /* if the threshold is no longer broken but the alert is still active, delete it */
  when matched and m.max_value <= m.upper_bound then
    delete;
$$);

What's new in Citus 12 is that this MERGE works even though the measurements table and the active_alerts table are not co-located (their distribution columns do not match). The MERGE will re-partition the data across the cluster on the fly, in one parallel, distributed transaction. That makes MERGE the most advanced distributed database command available in Citus. If you use MERGE in combination with schema-based sharding, then it will be fully pushed down to the node that stores the schema and handled by PostgreSQL.

Start building scalable apps

Schema-based sharding in Citus 12 gives you a new and easy way to scale out your PostgreSQL database. Whether you are an ISV building a multi-tenant SaaS application, microservices, or a complex OLTP workload that can use vertical partitioning, you can now easily tell Citus how to distribute the database by grouping tables into schemas, without any special syntax. There are various application development libraries that will help you do this, but you can also easily build it yourself through simple commands like SET search_path TO, or by using variable schema names in your queries.

To learn even more about what Citus 12.0 can do for you:

* Check out the 12.0 Updates page if you want to read through the detailed release notes
* Join the livestream of the Citus 12.0 release party, with demos: the livestream will include a demo of schema-based sharding and is happening on Wed 02 Aug @ 9:00am PDT (mark your calendar)

And if you want to get started distributing Postgres with Citus, the links in the bullets below should be useful:

* Citus open source repo on GitHub
* Getting started page for Citus
* Download Citus open source
* Citus open source docs
* Join the Citus Public Slack
* Quickstart docs for Citus on Azure, now known as Azure Cosmos DB for PostgreSQL... Since Citus 12.0 was just released, it is not yet available on Azure, but it will be soon. Oh, and there is a free trial, too.