[HN Gopher] Policy Engines: Open Policy Agent vs. AWS Cedar vs. ...
       ___________________________________________________________________
        
       Policy Engines: Open Policy Agent vs. AWS Cedar vs. Google Zanzibar
        
       Author : gemanor
       Score  : 45 points
       Date   : 2023-08-17 18:06 UTC (4 hours ago)
        
 (HTM) web link (www.permit.io)
 (TXT) w3m dump (www.permit.io)
        
       | orweis wrote:
       | Founder of Permit.io here- cool that this article grabbed some
       | love. For those of you not sure which is the best from the
       | article- Permit combines all 3 together.
       | 
       | - OPA/Rego or Cedar at the edge, for quick, efficient,
       | zero-latency policies
       | - And Zanzibar at the cloud control plane to manage the overall
       | picture and relationships
        
       | tptacek wrote:
       | I thought this was a pretty weak writeup. I'm somewhat familiar
       | with Zanzibar and less familiar with OPA or Cedar, and the
       | coverage of Zanzibar was odd and superficial. Zanzibar is a
       | large-scale distributed system whose motivation is handling
       | intricately related sets of ACLs while avoiding vulnerabilities
       | that come from disrespecting causal ordering.
       | 
       | One big "con" to using it is that it's an internal Google system!
       | If you're going to compare it to open-source policy engines, the
       | sane thing to do would be to pick one of the open-source
       | Zanzibar-inspired systems, and compare that.
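To make the causal-ordering point concrete, here is a minimal Python sketch of the failure the Zanzibar paper calls the "new enemy" problem. The `VersionedAcl` class, its methods, and the `replica_version` parameter are hypothetical and exist only for illustration; this is not Zanzibar's storage model or API.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedAcl:
    """Toy ACL store that keeps every version so reads can pick a snapshot."""

    # history[i] is the set of (user, doc) pairs allowed at version i.
    history: list = field(default_factory=lambda: [frozenset()])

    def write(self, allowed):
        """Store a new ACL state; the returned version acts as a consistency token."""
        self.history.append(frozenset(allowed))
        return len(self.history) - 1

    def check(self, user, doc, at_least=0, replica_version=None):
        """Check access at a snapshot no older than `at_least`.

        `replica_version` simulates a replica that has fallen behind.
        """
        latest = len(self.history) - 1
        version = latest if replica_version is None else replica_version
        version = min(max(version, at_least), latest)
        return (user, doc) in self.history[version]


acl = VersionedAcl()
v1 = acl.write({("alice", "doc"), ("bob", "doc")})
v2 = acl.write({("alice", "doc")})  # Bob is removed...
content_token = v2                  # ...and only then is new content saved, tagged with v2.

# A stale replica still at v1 wrongly lets Bob read the new content.
stale_allows_bob = acl.check("bob", "doc", replica_version=v1)
# Requiring a snapshot at least as fresh as the content's token closes the gap.
fresh_allows_bob = acl.check("bob", "doc", at_least=content_token, replica_version=v1)
```

In this toy, the stale check grants Bob access and the token-bounded check denies it. The design choice being illustrated is that the content carries a token recording which ACL version it was written under.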
        
         | orweis wrote:
         | On Zanzibar: the article refers to OSS implementations like
         | SpiceDB or Ory. It's a follow-up to a more in depth article
         | (1), trying to be a lighter read starting point.
         | 
         | - 1: https://www.permit.io/blog/zanzibar-vs-opa
        
           | tptacek wrote:
           | It refers to Zanzibar as a "graphical" system, which I think
           | was the first thing that snagged me on this. Your post does
           | too; I assume this is a language snag? "Graphical" doesn't
           | connote "graph-based" in American idiom, but rather "visual".
           | 
           | I don't think your writeup really captures OPA vs. Zanzibar
           | especially well either, for the reasons given by the SpiceDB
           | person upthread. It just sort of defines away the problem
           | Zanzibar is trying to solve, while claiming that Zanzibar-
           | type systems aren't deployable at the edge --- which is
           | pretty clearly not true?
        
             | orweis wrote:
             | Re: "Graphical" - I can see how that would have that effect
             | :)
             | 
             | To be fair, it doesn't really say that; it reads: "Graph-based
             | authorization systems utilize a graphical representation to
             | illustrate relationships between users and resources"
             | 
             | Still, I think Daniel (post author) could have picked
             | better phrasing - I'll ask him to change it.
             | 
             | > "while claiming that Zanzibar-type systems aren't
             | deployable at the edge"
             | 
             | For most companies it's extremely impractical; and for a
             | developer (the audience of this article) who simply wants to
             | add performant permissions to their application without
             | embarking on a whole devops adventure, it's as good as true.
        
         | Dowwie wrote:
         | Another big "con" is its complexity and difficulty to
         | understand/implement. It's the kind of thing that once you have
         | a handle on, you go into business trying to sell it as a
         | service because you look behind you and see a moat.
        
         | evancordell wrote:
         | I've seen the sentiment in this article pop up in a few places,
         | which I'd summarize as: Policy languages like OPA and Cedar are
         | fast to evaluate and simple to write, so you should use them for
         | all of your authorization needs.
         | 
         | But policy engines are only _really_ fast and simple if they
         | already have all of the data they need at evaluation time.
         | 
         | If you look at the examples in the Cedar playground[0], they
         | require you to provide a list of "entities" to Cedar at eval-
         | time. These entities are some (potentially large) chunk of your
         | application's data. And while the policy evaluation over that
         | data may be fast, the round trip to your database is probably
         | not. And then you start to think about caching, data
         | consistency, and so on, and suddenly you're thinking about a
         | lot of the problems that Zanzibar was designed to address (but
         | you're on your own to build it out).
         | 
         | IMO policy engines are best suited for ambient request data:
         | things you already know about a request because of a session, a
         | route, or a network path, and policies that make sense to
         | manage on the same lifecycle as your application.
         | 
         | Disclaimer: I work on SpiceDB[1], a Zanzibar implementation,
         | but I do also like policy engines.
         | 
         | [0]: https://www.cedarpolicy.com/en/playground
         | 
         | [1]: https://github.com/authzed/spicedb
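A minimal Python sketch of that distinction. The two policy functions, the `FAKE_DB` contents, and the `fetch_entities` loader are hypothetical stand-ins, not Cedar's or OPA's API; the point is only that the second policy cannot run until application data has been loaded.

```python
def ambient_policy(request):
    """Decides from data already attached to the request (session, route)."""
    return request["role"] == "admin" or request["route"].startswith("/public/")


def entity_policy(request, entities):
    """Decides from application data that must be supplied at evaluation time."""
    doc = entities["documents"][request["doc_id"]]
    return request["user"] == doc["owner"] or request["user"] in doc["editors"]


FAKE_DB = {"documents": {"d1": {"owner": "alice", "editors": ["bob"]}}}
stats = {"db_round_trips": 0}


def fetch_entities(doc_id):
    """Stand-in for the database round trip the caller has to pay for."""
    stats["db_round_trips"] += 1
    return {"documents": {doc_id: FAKE_DB["documents"][doc_id]}}


request = {"user": "bob", "role": "member", "route": "/docs/d1", "doc_id": "d1"}

# Evaluated from the request alone: no data fetch needed.
ambient_decision = ambient_policy(request)
# Evaluated over entities: the fetch happens before the policy can run.
entity_decision = entity_policy(request, fetch_entities(request["doc_id"]))
```

The evaluation itself is a couple of dictionary lookups in both cases. What differs is that the entity-based decision forces a fetch, and with it the caching and consistency questions raised above.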
        
           | gemanor wrote:
           | > And then you start to think about caching, data
           | consistency, and so on
           | 
           | If you are looking at OPA or Cedar as a standalone engine,
           | this is the correct assumption. To avoid this hassle, there
           | is an open-source tool called OPAL[1] that will let you run
           | the policy engines with all the sync work without any
           | investment in custom solutions. OPAL has a ready mechanism
           | for data fetching and synchronization, so you can plug it
           | into your application's data and not worry about the data.
           | 
           | Disclaimer: I'm one of the OPA maintainers.
           | 
           | [1] https://github.com/permitio/opal
        
             | evancordell wrote:
             | The article was comparing OPA/Cedar to Zanzibar, which is
             | why my head went there. I did go looking for info on how
             | OPAL deals with caching and consistency and found these:
             | 
             | - Authz data is kept in memory, so what you can authorize
             | over is limited by the memory of the box you run OPAL/OPA on.
             | The docs also mention sharding, but I'm not clear on how
             | you actually do that with OPA. [0] Maybe there's another
             | doc that I missed.
             | 
             | - You can get a token representing the last time data was
             | synced to the cache in an OPAL health check, but I'm not
             | clear on how you'd use it to ensure consistency in your
             | application since hydrating the cache is asynchronous. [1]
             | 
             | Anyway, those are the types of things Zanzibar is concerned
             | with, so that comparison (instead of Cedar) would've made
             | more sense to me. Without spending more time on it, I'm not
             | sure if I've represented OPAL correctly above, that's just
             | what I found when I went looking.
             | 
             | [0]: https://docs.opal.ac/faq/#handling-a-lot-of-data-in-
             | opa
             | 
             | [1]: https://docs.opal.ac/faq/#how-does-opal-guarantee-
             | that-the-p...
        
               | gemanor wrote:
               | > I'm not clear on how you actually do that with OPA
               | 
               | The sharding is managed from the OPAL control plane: when you
               | configure the data sources, you also configure how the
               | sharding works.
               | 
               | > ensure consistency in your application since hydrating
               | the cache is asynchronous.
               | 
               | OPAL uses eventual consistency for cache reliability; you can
               | know that data has changed even before you know what changed.
        
       | jreynoldsdev wrote:
       | What I still struggle to understand with these systems is they
       | seem great for single resource authorization, but how do you
       | perform bulk queries? For example, a user wants to query all
       | blogs they have access to (assuming there are large amounts of
       | them), does that require separate authorization logic in the DB?
        
         | turtles3 wrote:
         | This, especially when you combine it with pagination, filtering
         | and ordering requirements.
         | 
         | Zanzibar implementations (e.g. SpiceDB, Keto, etc.) offer
         | functionality for listing resources accessible to a given
         | principal, but as far as I can see none have a coherent
         | solution for filtering and ordering.
         | 
         | The only solution I can see to this is as you suggest
         | maintaining a shadow copy of the relationships in your db so
         | you can answer the question with a regular SQL query. This
         | obviously comes with a lot of headaches, and is the sole factor
         | preventing us from adopting one of these systems, so I really
         | hope I'm wrong about this.
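A minimal sketch of that shadow-copy approach, using Python's built-in `sqlite3`. The `blogs` and `blog_viewers` tables, their columns, and the `list_visible_blogs` helper are hypothetical; the point is that once the relationships live next to the data, authorization, filtering, ordering, and pagination collapse into one query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blogs (id INTEGER PRIMARY KEY, title TEXT, published_at TEXT);
    -- Shadow copy of the authorization relationships (user may view blog).
    CREATE TABLE blog_viewers (blog_id INTEGER, user_id TEXT);
""")
conn.executemany("INSERT INTO blogs VALUES (?, ?, ?)", [
    (1, "Intro to Rego", "2023-01-10"),
    (2, "Cedar notes", "2023-03-05"),
    (3, "Zanzibar deep dive", "2023-06-20"),
    (4, "Private draft", "2023-07-01"),
])
conn.executemany("INSERT INTO blog_viewers VALUES (?, ?)", [
    (1, "alice"), (2, "alice"), (3, "alice"), (4, "bob"),
])


def list_visible_blogs(user_id, since, limit, offset):
    """Return titles the user may view, filtered, ordered, and paginated in SQL."""
    rows = conn.execute(
        """
        SELECT b.title
        FROM blogs AS b
        JOIN blog_viewers AS v ON v.blog_id = b.id
        WHERE v.user_id = ? AND b.published_at >= ?
        ORDER BY b.published_at DESC
        LIMIT ? OFFSET ?
        """,
        (user_id, since, limit, offset),
    ).fetchall()
    return [title for (title,) in rows]


page_one = list_visible_blogs("alice", since="2023-02-01", limit=1, offset=0)
page_two = list_visible_blogs("alice", since="2023-02-01", limit=1, offset=1)
```

The cost is the one described above: `blog_viewers` is a second source of truth that has to be kept in sync with the authorization system.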
        
         | gemanor wrote:
         | Data filtering is approached differently by each of the
         | policy engines mentioned. For OPA, custom Rego code could
         | return the allowed data, and the caching mechanism will ensure
         | its consistency and reliability. For Zanzibar, since the policy
         | is derived from the data relations, data filtering is an
         | integral part of the paper. I recommend the following article
         | for more information about policy as code and policy as data to
         | understand the context better -
         | https://www.permit.io/blog/zanzibar-vs-opa
        
         | random3 wrote:
         | From short scans of the papers, at least with Zanzibar, AFAIK
         | you can define entities and relations (think groups of users
         | and directories) and infer rights based on those. I'm assuming
         | Zanzibar backs the actual Google 360 document sharing so
         | presumably it would scale for that use-case.
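A toy Python sketch of that idea. The tuple layout loosely follows the object, relation, subject triples described in the Zanzibar paper, but the `check` function is a plain recursive walk written for illustration, and the "viewers of a parent are viewers of the child" rule is hard-coded here rather than expressed as a configurable rewrite rule. None of this is Zanzibar's algorithm or any implementation's API.

```python
# Each tuple reads "subject has relation on object". A subject is either a
# plain ID or an (object, relation) pair meaning "everyone with that relation".
TUPLES = {
    ("group:eng", "member", "user:alice"),
    ("folder:specs", "viewer", ("group:eng", "member")),
    ("doc:roadmap", "parent", "folder:specs"),
}


def check(obj, relation, user, seen=None):
    """Return True if `user` has `relation` on `obj`, directly or by inference."""
    seen = set() if seen is None else seen
    if (obj, relation) in seen:  # guard against cycles in the graph
        return False
    seen.add((obj, relation))

    for t_obj, t_rel, subject in TUPLES:
        if t_obj != obj:
            continue
        if t_rel == relation:
            if subject == user:
                return True
            # Subject is a userset such as (group:eng, member): recurse into it.
            if isinstance(subject, tuple) and check(subject[0], subject[1], user, seen):
                return True
        # Hard-coded rule: viewers of a parent folder can view the document.
        if t_rel == "parent" and relation == "viewer" and check(subject, "viewer", user, seen):
            return True
    return False


alice_can_view = check("doc:roadmap", "viewer", "user:alice")
bob_can_view = check("doc:roadmap", "viewer", "user:bob")
```

Alice is never listed on the document itself; her access is inferred through the group and the parent folder, which is the "infer rights from entities and relations" behavior described above.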
        
           | RandomBK wrote:
           | The Google paper refers to the existence of some
           | 'permissions-aware index' (paraphrasing) that's used to
           | answer range queries like this, but doesn't cover how this
           | index would work.
           | 
           | I know various Zanzibar implementations have exposed APIs to
           | solve this problem, but I still don't have a great intuitive
           | understanding of how they work beyond 'push the ACL logic
           | into the data layer', which brings us back to a pre-zanzibar
           | world.
        
         | jen20 wrote:
         | Zanzibar in particular is designed to be able to answer the
         | question "what can this user access?", or "who can access
         | resource X?" as well as "can user Y access resource Y?".
         | 
         | This article from OSO [1] explains how, with references to
         | tweets from Lea Kissner (one of the authors of the paper and
         | implementors) which are unfortunately less useful now that
         | Twitter threads have been vandalised.
         | 
         | [1]: https://www.osohq.com/post/zanzibar
        
         | jzelinskie wrote:
         | Full disclosure: I'm a maintainer of SpiceDB, the most mature
         | open source project inspired by Zanzibar
         | 
         | For this exact use case, SpiceDB created two APIs not available
         | in Zanzibar: LookupSubjects and LookupResources. For other
         | scenarios, there's also a BulkCheck API for performing many
         | checks with less request overhead. The sibling comment here is
         | correct that there isn't filtering/sorting available in SpiceDB
         | yet.
         | 
         | Additionally, there are folks using SpiceDB today by
         | replicating denormalized checks back into their database (e.g.
         | Postgres) or search index (e.g. Elastic) so that you can filter
         | them natively. This is the combination of the aforementioned
         | Lookup APIs with our Watch API. While this strategy requires
         | moving parts, it is necessary beyond a particular scale which
         | is well beyond the point at which policy engines typically fall
         | over.
         | 
         | While I'm biased, I do find this article somewhat misleading
         | when describing Zanzibar-inspired systems; it presents opinion
         | without any evidence or examples to justify the claim and
         | concludes it as fact, but that might be because they're leaning
         | on their previous article. Zanzibar is novel because it is
         | fundamentally designed to be run at the edge and solves the
         | difficult problem of keeping the view of data at the edge
         | consistent. This article conveniently leaves out how other
         | systems get data to the edge while still keeping it consistent
         | for their authorization logic. Latency is also brought up, but
         | we recently managed to scale SpiceDB to >1M requests per second
         | with 100B relationships while maintaining a 5ms p95 measured at
         | the client application[0]. The claim that you absolutely need a
         | service to run a Zanzibar system is a provably false claim
         | based on the number of clusters in the wild running SpiceDB or
         | Ory's Keto project.
         | 
         | [0]: https://authzed.com/blog/google-scale-authorization
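A minimal sketch of that replication pattern in Python. The event shape and the `apply_watch_events` and `search` helpers are hypothetical and do not mirror SpiceDB's actual Watch or Lookup APIs; a real pipeline would also have to re-expand indirect grants (for example through groups) whenever a relationship changes.

```python
from collections import defaultdict

# Denormalized "who can view what" index, standing in for a Postgres table
# or a search-index field that the application can filter on natively.
viewable_by_user = defaultdict(set)


def apply_watch_events(events, index):
    """Apply a batch of relationship changes to the denormalized index."""
    for event in events:
        docs = index[event["user"]]
        if event["op"] == "touch":
            docs.add(event["doc"])
        elif event["op"] == "delete":
            docs.discard(event["doc"])
    return index


def search(user, docs, keyword, index):
    """Filter documents by keyword and by the replicated permissions."""
    allowed = index[user]
    return sorted(d for d, text in docs.items() if d in allowed and keyword in text)


DOCS = {"d1": "zanzibar notes", "d2": "cedar notes", "d3": "zanzibar draft"}

apply_watch_events(
    [
        {"op": "touch", "user": "alice", "doc": "d1"},
        {"op": "touch", "user": "alice", "doc": "d3"},
        {"op": "delete", "user": "alice", "doc": "d3"},  # access later revoked
    ],
    viewable_by_user,
)
alice_hits = search("alice", DOCS, "zanzibar", viewable_by_user)
```

The filter in `search` is where native sorting and pagination would go; the trade-off is that the index is only as fresh as the last event applied.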
        
           | orweis wrote:
           | Jimmy I truly think you're awesome (And so is SpiceDB), but
           | the irony here stands out: "it presents opinion without any
           | evidence or examples to justify the claim and concludes it as
           | fact"
           | 
           | You mean stuff like:
           | 
           | 1) "SpiceDB, the most mature open source project inspired by
           | Zanzibar" (though I'd vouch for that one)
           | 2) "it is necessary beyond a particular scale which is well
           | beyond the point at which policy engines typically fall over."
           | 3) "Zanzibar is novel because it is fundamentally designed to
           | be run at the edge"
           | 4) "we recently managed to scale SpiceDB to >1M requests per
           | second with 100B relationships while maintaining a 5ms p95
           | measured at the client application" - to be fair, you should
           | bundle that statement with the caveat that you need to set it
           | up within your own VPC.
           | 5) "The claim that you absolutely need a service to run a
           | Zanzibar system is a provably false claim based on the number
           | of clusters in the wild running SpiceDB or Ory's Keto
           | project" - how many clusters? :)
           | 
           | Re: "This article conveniently leaves out how other systems
           | get data to the edge while still keeping it consistent for
           | their authorization logic" - the article actually does mention
           | OPAL [0]
           | 
           | [0]: https://www.permit.io/blog/introduction-to-opal
        
             | jzelinskie wrote:
             | Your critique of my comment is quite fair; we're both
             | guilty of making claims, but not including all the
             | supporting evidence for brevity's sake. I think we can both
             | agree that everyone working in this space is doing awesome
             | work and bringing authorization the attention that it has
             | sorely needed.
        
               | orweis wrote:
               | Agree 100%. <3 And as I told Joey many times - I'd love
               | to collaborate more with you as well.
        
         | bfeynman wrote:
         | What do you mean by separate authorization logic? There are many
         | layers to auth, and usually they act as interceptors in the
         | request path that run very fast. If you have blanket permission
         | to list, you are able to list the resources you have access
         | to... that's trivial. However, `Blog` resources might have
         | explicit deny policies on them as well, so yes, those are also
         | evaluated. Not sure how else you'd expect it to work, short of
         | caching things like the current state of resources and access.
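A small Python sketch of the layering described here: a blanket list permission checked first, then explicit per-resource denies applied on top. The `visible_blogs` function and its inputs are hypothetical and not tied to any particular policy engine.

```python
def visible_blogs(user, blogs, can_list, explicit_denies):
    """Blanket allow first, then drop anything with an explicit deny."""
    if user not in can_list:
        return []
    return [blog for blog in blogs if (user, blog) not in explicit_denies]


BLOGS = ["b1", "b2", "b3"]
CAN_LIST = {"alice", "bob"}
EXPLICIT_DENIES = {("bob", "b2")}

alice_blogs = visible_blogs("alice", BLOGS, CAN_LIST, EXPLICIT_DENIES)
bob_blogs = visible_blogs("bob", BLOGS, CAN_LIST, EXPLICIT_DENIES)
carol_blogs = visible_blogs("carol", BLOGS, CAN_LIST, EXPLICIT_DENIES)
```

Note that the deny pass still touches every candidate resource, which is the scaling concern the original question raises for large result sets.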
        
           | ahoka wrote:
           | Yes, you need to consider authorization at every layer. You
           | can blanket deny a lot of things in a midlayer, but sooner or
           | later you need to start interpreting business logic to do the
           | rest.
        
       ___________________________________________________________________
       (page generated 2023-08-17 23:01 UTC)