[HN Gopher] Realtime Postgres Row Level Security
___________________________________________________________________
Realtime Postgres Row Level Security
Author : kiwicopple
Score : 119 points
Date : 2021-12-01 15:00 UTC (8 hours ago)
(HTM) web link (supabase.com)
(TXT) w3m dump (supabase.com)
| polycaster wrote:
| What approaches do you use to prevent DoS attacks with such an
| architecture? It seems trivial for an attacker to generate super
| expensive queries en masse.
|
| Strict row limits could bring some relief but don't solve the
| problem.
|
| I bet there are solutions for this but it's not very obvious.
| Would someone briefly explain?
| tehnorm wrote:
| We currently make use of Supabase and it's been fantastic. It's
| enabled us to completely get away from having a traditional
| backend/API by utilizing RLS and its realtime nature to have
| data directly in the UI. Rough numbers show we cut development
| time by a third vs a traditional approach.
| iwebdevfromhome wrote:
| When I think of PostgREST, Supabase and other tools that allow
| you to skip the backend completely and go straight to the DB,
| my question is: how do you handle business logic that doesn't
| make sense to have in either the frontend or the DB?
| thelastbender12 wrote:
| I have the same question. Since Supabase still exposes the
| connection parameters to the database, we could spin up
| another Flask server to deal with routes that are not
| expressible as simple CRUD operations. The JWT secret shared
| by the auth service (GoTrue) with Postgrest is available too,
| so I could also use it to authenticate the users myself.
|
| @supabase developers, is this a reasonable way to do things
| or am I missing something?
| polskibus wrote:
| You can do business logic in stored procedures, functions and
| views.
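|
| For example, a minimal sketch of business logic living in a
| function, which PostgREST exposes as an RPC endpoint
| (POST /rpc/transfer_credits); all names here are illustrative:
|
|     CREATE FUNCTION transfer_credits(from_id uuid, to_id uuid,
|                                      amount int)
|     RETURNS void LANGUAGE plpgsql AS $$
|     BEGIN
|       -- Validation and the check-and-update logic stay in
|       -- the database, next to the data:
|       IF amount <= 0 THEN
|         RAISE EXCEPTION 'amount must be positive';
|       END IF;
|       UPDATE accounts SET balance = balance - amount
|         WHERE user_id = from_id AND balance >= amount;
|       IF NOT FOUND THEN
|         RAISE EXCEPTION 'insufficient balance';
|       END IF;
|       UPDATE accounts SET balance = balance + amount
|         WHERE user_id = to_id;
|     END;
|     $$;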
| michelpp wrote:
| My go-to approach is to insert jobs into a queue table, and
| then have backend workers that consume items from the queue.
| This has a number of advantages:
|
| 1. Faster user experience: the user isn't waiting for the
| business logic to complete, and inserting into the queue
| should only take a couple of milliseconds.
|
| 2. It's more secure: the web worker can have minimal
| privileges (INSERT only on the queue table), but the backend
| workers can have much more privilege because they are not
| user facing.
|
| 3. You can scale the web workers orthogonally to the queue
| workers, as they'll likely have very different scaling
| properties.
|
| Supabase also has a WAL decoding tool called WALRUS that I
| have not tried yet, and that could be the most efficient
| approach going forward. The tradeoff with queue tables is
| that the tables are persistent and a bit more resilient to
| failure/retry.
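|
| A minimal sketch of the pattern (table and role names are
| illustrative):
|
|     CREATE TABLE job_queue (
|       id         bigserial PRIMARY KEY,
|       payload    jsonb NOT NULL,
|       created_at timestamptz NOT NULL DEFAULT now()
|     );
|
|     -- The user-facing role can only enqueue:
|     GRANT INSERT ON job_queue TO web_user;
|     GRANT USAGE ON SEQUENCE job_queue_id_seq TO web_user;
|
|     -- Backend workers claim jobs without blocking each other:
|     DELETE FROM job_queue
|     WHERE id = (SELECT id FROM job_queue
|                 ORDER BY id
|                 FOR UPDATE SKIP LOCKED
|                 LIMIT 1)
|     RETURNING payload;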
| cpursley wrote:
| Do you have more info on the WALRUS stuff? Is that part of
| the Elixir lib?
| cpursley wrote:
| I actually extracted out the WAL listening bit of code from
| Supabase and use that to listen to changes in Postgres then
| do callback style business logic in Elixir. If you like
| Elixir, this could be an option.
|
| I haven't had time to work on it myself until recently, but I
| brought up the idea of making that bit a separate library
| (and I understand that the Supabase folks are super busy).
|
| https://github.com/supabase/realtime/issues/146#issue-874855.
| ..
| pgroves wrote:
| There are lots of simple things that are normally easier to
| do in the web framework that are suddenly easier to do in the
| database (with the side effect that you can do DB
| optimizations much easier).
|
| But the other consideration is that you likely need to do a
| lot with a reverse-proxy like traefik to have much control of
| what you are really exposing to the outside world. PostgREST
| is not Spring, it doesn't have explicit control over every
| little thing so you're likely to need something in front of
| it. Anyway, point is that having a simple Flask server with a
| few endpoints running wouldn't complicate the architecture
| very much b/c you are better off with something in front of
| it doing routing already (and ssl termination, etc).
| pgroves wrote:
| I'm on a POC project that's using PostgREST and it's been
| extremely fast to get a big complicated data model working with
| an API in front of it. But I guess I don't get how to really
| use this thing in reality? What does devops look like? Do you
| have sophisticated db migrations with every deploy? Is all the
| SQL in version control?
|
| I also don't really get where the users get created in postgres
| that have all the row-level permissions. The docs are all about
| auth for users that are already in there.
| michelpp wrote:
| This is my personal experience with using PostgREST (I
| haven't had the full supabase experience yet):
|
| > What does devops look like?
|
| I usually spin PostgREST workers up in some kind of managed
| container service, like Google Compute Engine. PostgREST is
| stateless, so other than upgrades, you never really need to
| cycle the services. As for resources, PostgREST is extremely
| lean; I usually try to run 4 to 8 workers per gigabyte of
| RAM.
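|
| For reference, each worker needs little more than a small
| config file, roughly like this (all values are placeholders):
|
|     # postgrest.conf
|     db-uri       = "postgres://authenticator:pass@db:5432/app"
|     db-schema    = "api"
|     db-anon-role = "web_anon"
|     jwt-secret   = "a-long-random-secret"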
|
| > Do you have sophisticated db migrations with every deploy?
|
| You can use whatever migration tool you want. Sqitch is
| quite popular. I've even worked on projects that were
| migrated by Django but PostgREST did the API service.
|
| > Is all the SQL in version control?
|
| Yes, this is a good approach, but it means you need a
| migration tool to apply the migrations in the right order.
| This is what Sqitch does, and many ORM-style libraries have
| migrations sort of half-baked in.
|
| It's worth noting that because many of the objects that
| PostgREST deals with are views, which have no persistent
| state, the migration of the views can be _decoupled_ from the
| migration of the persistent objects like tables. Replacing a
| view (with CREATE OR REPLACE VIEW) can be done very quickly
| without locking tables as long as you don't change the
| view's schema.
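|
| A sketch of what that decoupling looks like (names are
| illustrative):
|
|     -- Swapping view logic is near-instant and doesn't touch
|     -- the underlying table, provided the column list (names
|     -- and types) stays the same:
|     CREATE OR REPLACE VIEW api.active_users AS
|       SELECT id, name
|       FROM internal.users
|       WHERE deleted_at IS NULL;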
| kiwicopple wrote:
| In Supabase we use a separate Auth server [0]. This stores
| the user in an `auth` schema, and these users can log in to
| receive a JWT. Inside the JWT is a "role", which is, in fact,
| a PostgreSQL role ("authenticated") with certain grants
| associated with it, and the user ID (a UUID).
|
| Inside your RLS Policies you can use anything stored inside
| the JWT. My cofounder made a video [1] on this which is quite
| concise. Our way of handling this is just an extension of the
| PostgREST Auth recommendations:
| https://postgrest.org/en/v9.0/auth.html
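|
| A minimal sketch of such a policy (the table is illustrative;
| auth.uid() is Supabase's helper for reading the user ID claim
| out of the JWT):
|
|     ALTER TABLE todos ENABLE ROW LEVEL SECURITY;
|
|     CREATE POLICY "owners only" ON todos
|       FOR SELECT TO authenticated
|       USING (user_id = auth.uid());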
|
| [0] Auth server: https://github.com/supabase/gotrue
|
| [1] RLS Video: https://supabase.com/docs/learn/auth-deep-
| dive/auth-row-leve...
| kiwicopple wrote:
| Hey HN, Supabase is an open source Firebase alternative. We're
| building the features of Firebase using enterprise-grade open
| source tools. We're particularly focused on scalability. We take
| proven tools like Postgres, and we make them as easy to use as
| Firebase.
|
| Today, Supabase is adding Row Level Security (RLS) to our
| Realtime engine.
|
| The linked blog post goes into depth around the technical
| implementation, so I'll just give a recap for the people who jump
| straight to comments (like me).
|
| Supabase was launched here on HN when we open sourced our
| Realtime engine[0] - an Elixir server which clients (i.e. website
| visitors/users) can connect to via websockets and receive a
| stream of PostgreSQL changes.
|
| The server receives those changes via a logical replication slot
| - the same system that PostgreSQL uses for replicating to other
| databases.
|
| To achieve RLS we added a few SQL functions, the main one
| being apply_rls[1], which the stream is filtered through. For every user
| connected to the Elixir server, the Postgres function checks if
| they have access to the database change and appends an array of
| allowed user IDs. The Realtime server then delivers the change to
| the user only if the connected user is matched in this array.
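|
| Conceptually (a simplified sketch of the idea, not the actual
| apply_rls implementation), the per-subscriber visibility check
| amounts to something like:
|
|     -- Impersonate one subscriber and test whether RLS lets
|     -- them see the changed row (names are illustrative):
|     BEGIN;
|     SET LOCAL ROLE authenticated;
|     SELECT set_config('request.jwt.claim.sub',
|                       'subscriber-uuid', true);
|     -- A row coming back means this subscriber may see the
|     -- change, so their ID goes into the allowed-users array:
|     SELECT 1 FROM todos WHERE id = 'changed-row-id';
|     ROLLBACK;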
|
| This one has been a long time coming, and it's one of the reasons
| why we have maintained our "beta" badge for so long. A few of the
| team will be here to answer any questions - my cofounder @awalias
| and @steve-chavez from PostgREST, @inian, @wenbo and @1_over_n
|
| [0] Realtime Show HN:
| https://news.ycombinator.com/item?id=22114560
|
| [1] SQL function:
| https://github.com/supabase/realtime/blob/master/server/priv...
| BrandiATMuhkuh wrote:
| I just want to say thank you. RLS is the feature I was waiting
| for. It's impressive work you have done!
| me_me_mu_mu wrote:
| Looks really interesting, looking at your docs. I have an
| application using firebase + geofire where users can see things
| on a map, and those results are themselves updated (ex. real
| time reactions). However, geofire limits me heavily because I
| can't add additional filtering (ex. timestamp) and it lacks
| bbox queries.
|
| Can I use Supabase's postgres + postgis setup and support the
| same type of real time functionality, where I get a bbox
| result set to display on a map, and then, when one of those
| items gets a reaction or something from some user, have that
| change propagate to anyone else who has that particular item
| in their current map bbox result set? Just wondering if you
| know if that's possible. Thanks
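|
| (For context, the kind of bbox query meant here, with table
| and column names as placeholders:)
|
|     -- PostGIS: fetch items intersecting a map viewport, plus
|     -- a time filter geofire can't express:
|     SELECT id, geom
|     FROM items
|     WHERE geom && ST_MakeEnvelope(-74.1, 40.6, -73.7, 40.9, 4326)
|       AND created_at > now() - interval '1 day';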
| golemiprague wrote:
| Looking at Hasura, they also provide column level security and
| an option to add custom logic via actions. Is this something
| planned for the future? And in general, how would you compare
| the two systems, and what are the advantages or disadvantages
| of Supabase?
| Ericson2314 wrote:
| The row level security people should talk to the capability
| security people.
|
| Probably _the_ main problem with RDBMSs is also their biggest
| benefit: you write code that naively looks like some bad
| polynomial time, and then the query planner does its magic and
| uses indices. The problem is, of course, when the index
| doesn't exist or the query planner doesn't do a good job.
|
| What would be nice is a way to _mandate_ that queries must go
| through certain indices. But this is in fact good for row level
| security too!
|
| Functional programming is deeply related to capability-based
| security, in that both try to get rid of all ambient authority
| and mediate all actions through some sort of reference. Rust,
| with its many types of reference, makes the relationship even
| clearer. From these vantage points, it is very natural to use
| data structures both to make things performant _and_ to
| enforce certain invariants.
|
| Indices are the closest data structures in RDBMSs, because
| they promote only certain access patterns. With the rule that
| queries must go through the indices, they enforce those access
| patterns, and thus we have the same proven tool. Combine them
| with views for filtering out data (which, in conjunction with
| more indices, we can think of as a sort of curried index
| Map<UserID, Map<Stuff, OtherStuff>>, even if they aren't
| implemented that way), and now we have very powerful tools for
| guaranteeing performance and security.
|
| As icing on the cake, I think enforcing limited access
| patterns could get us closer to that long-sought goal of an
| RDBMS that automatically knows how to distribute itself. The
| limited access patterns open up possible options, because the
| pathological joins that every sharding possibility runs into
| are no longer allowed. Reusing the same query planner tricks,
| "cardinality analysis" type stuff can be used to choose
| between those possibilities.
|
| Here's a worked out concrete example:
|
| Tables:
|
|     User: { k: UserID PK, t: TennentId, n: Name, isAdmin: Bool }
|     Todo: { k: TodoID PK, u: UserID, d: Date, m: Msg }
|
| Indices and Views:
|
|     # exists_classical means proof of existence, but we don't
|     # care which witness.
|     Map<(t: TennentId, exists_classical u: User. u.t = t,
|          u.isAdmin = true), {u2 <- User, u2.t = t}>
|     Map<(t: TennentId, exists_classical u: User. u.t = t,
|          u.isAdmin = true), {t <- Todo, u2 <- User,
|          t.u = u2.k, u2.t = t}>
|
| ^ that's a joined view with comprehension syntax
|
|     # exists_constructive means we need to show existence and
|     # we do see which item witnesses it.
|     Map<t: TennantID, Map<uk: UserID, exists_constructive
|         u: User. u.t = t, u.k = uk>>
|     Map<(t: TennentId, Map<uk: UserId, exists_classical
|         u: User. u.t = t, u.k = uk, {t <- Todo, t.u = u.k}>>
|
| The more worked out system would probably allow tupling stuff to
| not repeat the "Map<(t: TennentId," at the beginning.
|
| A few things to note:
|
| 1. When we use comprehension syntax, we are "returning the
| whole table", i.e. an unconstrained view that can be queried
| in an arbitrary way. A joined table has to be joined in the
| given way, but is otherwise unconstrained.
|
| 2. exists_constructive is the same as a { ... } comprehension,
| but the cardinality must be 1, as proven by the unique
| constraints. exists_classical is a comprehension for an
| arbitrary number of items that is then _truncated_ in the
| type theory sense.
| https://xenaproject.wordpress.com/2021/05/19/the-trace-of-an...
| is a good discussion of an example of this truncation.
|
| 3. Normally one puts the TennentID on every table, even though it
| is denormalizing. Now the access patterns enforce this.
|
| Clearly this is a few PhD theses away from actually existing :),
| but all the embryonic theory comes from good well-established
| things, which makes me confident.
| steve-chavez wrote:
| > What would be nice is a way to mandate that queries must go
| through certain indices.
|
| Yes, this is an interesting idea. We're kinda gravitating
| towards this approach in PostgREST lately. For example, on a
| recent discussion about integrating with PGroonga[1], we were
| thinking of only exposing its operators once an index is
| enabled on a column:
|
|     CREATE INDEX index_name ON table USING pgroonga (column);
|
| That enables doing:
|
|     GET /table?column=groonga_query.PostgreSQL
|
| Which roughly translates to:
|
|     SELECT * FROM table WHERE column &@~ 'PostgreSQL';
|
| Of course this is PostgREST-specific; the restriction is only
| enforced when going through it.
|
| [1]: https://github.com/PostgREST/postgrest/issues/2028
| SahAssar wrote:
| I've been developing an app on postgrest (which supabase uses)
| for a while, and my solution to this was to have a NOTIFY trigger
| and a small program that broadcasted the changed ID's over SSE to
| clients. The clients would then fetch the changes.
|
| As for the RLS aspect I considered it "good enough" to allow all
| clients to know the changed ID (an UUID) as long as they got no
| other info, and planned to have the trigger also send the allowed
| groups in the NOTIFY so that the program could filter without
| needing a separate PG connection per each subscriber. In my
| testing it scales very well.
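|
| (For reference, a minimal sketch of such a trigger; table and
| column names are hypothetical:)
|
|     CREATE FUNCTION notify_change() RETURNS trigger
|     LANGUAGE plpgsql AS $$
|     BEGIN
|       -- Send only the row ID plus the groups allowed to see
|       -- it, keeping the payload small:
|       PERFORM pg_notify('changes', json_build_object(
|         'id', NEW.id,
|         'groups', NEW.allowed_groups)::text);
|       RETURN NEW;
|     END;
|     $$;
|
|     CREATE TRIGGER items_notify
|       AFTER INSERT OR UPDATE ON items
|       FOR EACH ROW EXECUTE FUNCTION notify_change();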
|
| Anyone from supabase that can comment why LISTEN/NOTIFY triggers
| were not used, and why WS was used over SSE when the
| communication is not bi-directional? It'd be interesting to hear
| your thoughts.
| kiwicopple wrote:
| > LISTEN/NOTIFY triggers were not used
|
| We started with this implementation; however, there are two
| issues with this approach: 1) NOTIFY has a payload limit of
| 8000 bytes, so large rows are silently dropped; 2) you need to
| attach the notify trigger to every table (vs a Publication,
| which can cover an entire database in a single line of code,
| shown below).
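|
| That single line looks like this (supabase_realtime is the
| publication name Supabase's setup uses):
|
|     CREATE PUBLICATION supabase_realtime FOR ALL TABLES;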
|
| > why WS was used
|
| The server is Elixir (Erlang), so it's very good at WS. Over
| time we will add other useful multiplayer features (like
| presence).
| SahAssar wrote:
| Thanks for the response! I have a few more questions if you
| have the time:
|
| If I'm reading the article correctly this feature is only
| available for authenticated users? Is that restriction based
| on scaling or some other limitation?
|
| When I looked at the example query in the article, it only
| queries the primary key. Does this also work for tables where
| the GRANT hides certain columns from different users?
___________________________________________________________________
(page generated 2021-12-01 23:01 UTC)