Zanzibar: the distributed authorization standard?

Ankit Trehan
8 min read · Nov 22, 2024


Google’s distributed authorization system powering apps like Google Photos, Drive, YouTube, Google Cloud and Google Maps

Zanzibar is Google’s planet-scale distributed access control list (ACL) storage and evaluation system. It is designed to scale to handle trillions of ACLs and millions of authorization requests per second, supporting services used by billions of people worldwide.

Why a common authorization service?

Authorization checks are essential for ensuring privacy and controlling access to sensitive systems. They dictate fine-grained access to specific resources, such as whether a user can only view a Google Doc or also edit it.

A unified authorization service standardizes access control across applications, abstracting away the complexity of authorization logic. At Google's scale, this has several benefits:

  1. Consistent semantics: It ensures that the user experience and access control rules are uniform across services.
  2. Cross-Application Coordination: It simplifies collaboration between applications managing their access control dependencies.
  3. Reduced Engineering Workload: Developers don’t need to build custom authorization logic for each application, saving significant time and effort.

The Simple Data Model

Since Zanzibar is used by multiple applications, it must provide a data model that is flexible yet simple enough to serve this diverse set of use cases, making it easy for developers to work with the authorization layer. The system stores its data as relation tuples, which define the relationships between users, groups, and objects in a straightforward way. A relation tuple is stored in an optimized binary encoding but can be represented textually as:

⟨tuple⟩ ::= ⟨object⟩‘#’⟨relation⟩‘@’⟨user⟩

Here, the user can be either an individual user or a set of users (e.g., a group). Groups can be nested, creating chains of group relationships. The relation specifies the type of access or relationship to an object; these can be direct or indirect. For example, direct relations might include viewer or editor, while an indirect relation might point to a parent folder (i.e., a parent-child access relationship).
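To make the tuple format concrete, here is a small Python sketch that parses the textual form above. The colon-separated object notation and the field names follow common Zanzibar conventions, but this helper is illustrative, not part of the real system (which stores tuples in a binary encoding).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RelationTuple:
    """One relation tuple in its <object>#<relation>@<user> text form."""
    namespace: str   # e.g. "doc"
    object_id: str   # e.g. "readme"
    relation: str    # e.g. "viewer"
    user: str        # a user id, or a userset such as "group:eng#member"

def parse_tuple(text: str) -> RelationTuple:
    """Parse the textual representation of a relation tuple."""
    obj, rest = text.split("#", 1)           # split object from relation@user
    relation, user = rest.split("@", 1)      # split relation from user/userset
    namespace, object_id = obj.split(":", 1) # objects are namespace:id
    return RelationTuple(namespace, object_id, relation, user)

t = parse_tuple("doc:readme#viewer@group:eng#member")
```

Note that the user field can itself contain `#`, which is how a tuple points at another userset (here, the members of `group:eng`) rather than an individual user.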

There are two main advantages of the tuple-based approach:

  • Consistent Representation: Access control is represented consistently, regardless of whether the user is part of a group or accessing an individual object.
  • Incremental Updates: Incremental updates to access control lists are simplified since the system operates on relation tuples rather than updating ACLs for each individual object.

Consistency

ACL checks must always reflect the most up-to-date changes and maintain strong consistency to avoid the “new enemy” problem: granting access to users who shouldn’t have it because of inconsistencies. This can occur if the system fails to respect the order of ACL updates or applies outdated ACLs to new content. For instance, if a user’s access is updated in quick succession from no access to editor and then to viewer, but the updates are applied out of order, the user may improperly retain editor access. Similarly, if a parent folder’s ACL is updated to remove a user and new content is then added to the folder, evaluating that content against the stale ACL would still grant the removed user access.

To prevent such issues, the authorization system must guarantee two things: causal consistency and bounded staleness of reads.

To achieve these guarantees, Zanzibar uses a protocol called zookie. Here’s how it works:

  1. When a client modifies content, it requests an opaque consistency token, a zookie, for the ACL version. Zanzibar returns a global timestamp with the zookie, ensuring all prior ACL updates have lower timestamps. Zanzibar relies on Spanner’s TrueTime mechanism to assign each ACL write a global, microsecond-resolution timestamp. The client then stores the zookie alongside the content.
  2. For future ACL checks, the client ensures that the check is at least as fresh as the timestamp of the content version by supplying the opaque zookie back to Zanzibar.

In this system, a zookie is an opaque byte sequence encoding a globally meaningful timestamp, which represents an ACL write, a content version, or a read snapshot.
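The round trip can be sketched as follows. A real zookie opaquely encodes a Spanner TrueTime timestamp; this toy version just embeds wall-clock microseconds, and all function names are hypothetical, not the real Zanzibar RPC surface.

```python
import time

def new_zookie() -> str:
    """Mint a zookie for a content change: an opaque token encoding a
    global timestamp (wall-clock microseconds stand in for TrueTime)."""
    ts = int(time.time() * 1_000_000)
    return f"zookie:{ts}"

def min_snapshot(zookie: str) -> int:
    """Recover the timestamp a zookie encodes (servers do this; clients
    treat the token as opaque)."""
    return int(zookie.split(":", 1)[1])

def fresh_enough(zookie: str, snapshot_us: int) -> bool:
    """A check evaluated at snapshot_us respects the content version
    named by the zookie iff the snapshot is at least that fresh."""
    return snapshot_us >= min_snapshot(zookie)

z = new_zookie()
ok = fresh_enough(z, min_snapshot(z) + 5)     # fresher snapshot: acceptable
stale = fresh_enough(z, min_snapshot(z) - 5)  # older snapshot: too stale
```

In the real system a too-stale snapshot is simply not used: the server picks (or waits for) a snapshot at or after the zookie's timestamp before evaluating the check.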

Here’s how this addresses the guarantees listed above:

With zookies, Zanzibar can respect causal relationships between changes: if event x happened before event y (x ≺ y), their timestamps satisfy Tx < Ty. When content is updated at timestamp Tc, any later check evaluated at a snapshot ≥ Tc is guaranteed to observe every ACL change that preceded the content update. In the stale-ACL example above, the zookie would carry the timestamp at which the new resource was added, so the check would be evaluated against everything up to that point, including the earlier removal of the user’s access.

Namespaces

In Zanzibar, namespaces define how relations are structured and configured for a resource, including rules about which relations can access others. For example, you might have a rule that all owners of a resource automatically have editor access to it:

relation { name: "owner" }
relation {
  name: "editor"
  userset_rewrite {
    union {
      child { _this {} }
      child { computed_userset { relation: "owner" } }
    }
  }
}

This configuration ensures that the system can handle a variety of complex access control scenarios efficiently.
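A toy evaluation of this rewrite rule, under an assumed in-memory tuple layout: the effective editor userset is the union of the directly written editors (the `_this` child) and the computed `owner` userset.

```python
from typing import Dict, Set, Tuple

# (object, relation) -> directly written users; illustrative data, not
# Zanzibar's actual storage layout.
TUPLES: Dict[Tuple[str, str], Set[str]] = {
    ("doc:readme", "owner"): {"alice"},
    ("doc:readme", "editor"): {"bob"},
}

def effective_editors(obj: str) -> Set[str]:
    this = TUPLES.get((obj, "editor"), set())   # child { _this {} }
    owners = TUPLES.get((obj, "owner"), set())  # computed_userset "owner"
    return this | owners                        # union { ... }
```

Because the rewrite lives in the namespace configuration rather than in the tuples, granting every owner editor access requires no extra tuples per object.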

API

Zanzibar exposes an API that lets clients read and update ACLs easily. With the Read API, clients query relation tuples to display ACLs or group memberships to users, or to prepare for a subsequent write. If the request does not include a zookie, the system picks a reasonably recent default timestamp for the query. The Read API operates at one level of nesting, returning direct relationships rather than following the full chain of access.

The Expand API, by contrast, returns the effective userset for a given resource, which includes all the information needed for building search indices or assessing group memberships. It follows all indirect relationships to assemble this result, a process the paper calls pointer chasing.

A content-change request doesn’t carry a zookie and is evaluated at the latest snapshot. But if the request is authorized, the system returns a zookie with the updated timestamp, which clients store along with the content. Future authorization checks use this zookie to ensure they reflect the latest state of access control.

The Write API allows clients to update a single relation tuple, or to modify all tuples related to an object via a read-modify-write process with optimistic concurrency. The client reads all relation tuples of an object, including a per-object lock tuple, applies its updates, and sends the result back to Zanzibar. Zanzibar commits the write only if the lock tuple has not been modified since the read.
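The read-modify-write flow can be sketched like this; the store, the version counter standing in for the lock tuple, and the exception are all illustrative.

```python
class ConflictError(Exception):
    """Raised when the lock tuple changed between read and write."""

class TupleStore:
    """Toy per-object tuple store with an optimistic lock tuple."""

    def __init__(self) -> None:
        self.tuples = {"doc:readme#viewer@bob"}
        self.lock_version = 0  # stands in for the per-object lock tuple

    def read(self):
        """Return the object's tuples plus the lock version seen."""
        return set(self.tuples), self.lock_version

    def write(self, new_tuples, expected_version: int) -> None:
        """Commit only if the lock tuple is unchanged since the read."""
        if self.lock_version != expected_version:
            raise ConflictError("lock tuple modified since read; retry")
        self.tuples = new_tuples
        self.lock_version += 1

store = TupleStore()
tuples, version = store.read()
tuples.add("doc:readme#editor@carol")
store.write(tuples, version)  # succeeds: nobody else wrote in between
```

A concurrent writer that committed first would bump the lock version, causing this write to fail and forcing the client to re-read and retry.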

Finally, we have the Check API which returns a boolean value based on the relation tuples defined for a user and object. These requests, the most common in the system, can be tricky to do quickly due to the heavily nested nature of the storage model.
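A minimal recursive check over relation tuples with nested groups might look like the sketch below. The real system evaluates these subchecks concurrently against a consistent snapshot; this single-threaded version only shows why nesting makes checks expensive.

```python
from typing import Optional, Set, Tuple

# Illustrative tuples: alice reaches the doc through two levels of groups.
TUPLES: Set[Tuple[str, str, str]] = {
    ("doc:readme", "viewer", "group:eng#member"),
    ("group:eng", "member", "group:backend#member"),
    ("group:backend", "member", "alice"),
}

def check(obj: str, relation: str, user: str,
          seen: Optional[Set[str]] = None) -> bool:
    """Return True iff user has `relation` on obj, following nested
    usersets ("pointer chasing"); `seen` guards against group cycles."""
    seen = seen if seen is not None else set()
    for o, r, u in TUPLES:
        if (o, r) != (obj, relation):
            continue
        if u == user:                      # direct membership
            return True
        if "#" in u and u not in seen:     # userset: recurse into the group
            seen.add(u)
            group, group_rel = u.split("#", 1)
            if check(group, group_rel, user, seen):
                return True
    return False
```

Each level of nesting adds another round of tuple lookups, which is exactly the cost the Leopard index described later is built to avoid.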

Architecture overview

One of the main considerations with Zanzibar is its latency. Since it contributes to the total latency of every application it serves, it needs to be very quick, and it is designed with this constraint in mind:

  • ACL Servers: Handle API requests, with traffic distributed across the cluster for efficient processing. A request can arrive at any server and is forwarded to other servers in the cluster as necessary; the initial server collects the responses and replies to the client.
  • Data Storage: Multiple Spanner databases store relation tuples, namespace configurations, and changelog data: one database per client namespace for its relation tuples, one database for all namespace configurations, and one changelog database shared across all namespaces. Data is replicated across regions to minimize latency for clients.
  • Watch Servers: Handle watch requests, allowing clients to track changes in ACL configurations and maintain secondary indexes of relation tuples.

Zanzibar also runs a periodic background task that garbage-collects tuple versions older than a per-namespace threshold.

Leopard Indexing System

Authorization checks are the most frequently executed operation in Zanzibar and are notoriously difficult to evaluate quickly because relationships can be very wide or deeply nested. This is why the system is optimized to handle them as efficiently as possible. Zanzibar uses the Leopard indexing system to precompute indirect relationships, speeding up access checks for users and groups.

There are two key indexing models:

  • GROUP2GROUP(s)→{e}, where s represents an ancestor group and e represents a descendant group that is directly or indirectly a sub-group of the ancestor group.
  • MEMBER2GROUP(s) → {e}, where s represents an individual user and e represents a parent group in which the user is a direct member.

To evaluate whether user U is a member of group G, we can now check the set directly by:

(MEMBER2GROUP(U) ∩ GROUP2GROUP(G)) ≠ ∅

In simpler terms, the indexing system flattens the nested relationships into sets. This lets Zanzibar evaluate access rights with a direct set intersection, which is much quicker than recursively following indirect links up the chain (the pointer chasing described earlier).
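A flattened Leopard-style lookup can be sketched as a set intersection; the index contents here are illustrative.

```python
from typing import Dict, Set

# GROUP2GROUP: ancestor group -> all direct or indirect sub-groups
# (including itself, so direct members of the ancestor also match).
GROUP2GROUP: Dict[str, Set[str]] = {
    "eng": {"eng", "backend", "infra"},
}

# MEMBER2GROUP: user -> groups where the user is a *direct* member.
MEMBER2GROUP: Dict[str, Set[str]] = {
    "alice": {"backend"},
}

def is_member(user: str, group: str) -> bool:
    """U is in G iff MEMBER2GROUP(U) ∩ GROUP2GROUP(G) ≠ ∅."""
    return bool(MEMBER2GROUP.get(user, set()) & GROUP2GROUP.get(group, set()))
```

The recursion cost is paid once, at index-build time, instead of on every check.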

The system described above works well for deep and widely nested relationships, but an offline index alone cannot serve fresh, consistent snapshots. For those cases, the Leopard servers maintain an online incremental layer that can query the changelog service when a check requires a timestamp fresher than the offline index. Leopard continuously ingests changes from the changelog data store to keep this incremental layer up to date; a single ACL change can lead to tens of thousands of changes in Leopard’s index.

Caching

To further reduce latency, Zanzibar uses caching for frequently accessed keys and request hedging for slow requests. In-flight indirect subchecks are also cancelled once the parent ACL check has been resolved, so they don’t propagate further.

Caching is carried out on both the delegate and the delegator side. The paper provides statistics for its cache hit rate. Since the request timestamps and relationships are constantly changing and so are the servers handling them, the headline hit rate seems quite low: 10% on the delegate side and 2% on the delegator side.

When a request is slow, it is sent to multiple servers as a hedge, and whichever server responds first is used, while others are canceled. This helps mitigate bottlenecks in the distributed system and ensures timely responses.
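A toy request hedge in Python: if the first replica has not answered within a threshold, fire a second copy and take whichever finishes first. Replica names and delays are made up, and for simplicity the loser is ignored rather than cancelled.

```python
import concurrent.futures as cf
import time

def query_replica(replica: str, delay_s: float) -> str:
    """Stand-in for an RPC to one replica with a fixed response time."""
    time.sleep(delay_s)
    return replica

def hedged_request(hedge_after_s: float = 0.05) -> str:
    """Send to a slow primary; hedge to a second replica if it stalls."""
    with cf.ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(query_replica, "slow-replica", 0.5)
        done, _ = cf.wait({first}, timeout=hedge_after_s)
        if not done:  # primary exceeded the hedge threshold
            second = pool.submit(query_replica, "fast-replica", 0.01)
            done, _ = cf.wait({first, second},
                              return_when=cf.FIRST_COMPLETED)
        return done.pop().result()
```

Production systems pick the hedge threshold from observed tail latencies so only a small fraction of requests are duplicated.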

Takeaways

Zanzibar provides a flexible and efficient data model for managing access control at scale. It uses advanced techniques to ensure consistency, low latency, and high availability, even when handling complex group memberships and deeply nested authorization rules. With features like the Leopard index for group-membership evaluation and caching for hot keys, it combines multiple distributed-systems techniques to mitigate hot spots, a critical production issue when serving data on top of normalized, consistent storage. Zanzibar is built to scale to trillions of access control rules and millions of authorization requests per second.

The most impressive aspect of Zanzibar is its ability to power large-scale services like Google’s, maintaining both high performance and strong consistency. The zookie protocol, in particular, is an elegant solution to managing causal dependencies and ensuring timely access control decisions.

As companies develop their own authorization systems, Zanzibar presents a powerful example of how to scale access control in a distributed environment. The question I have is: Is this the standard other organizations should aim for when designing their authorization layers?

Sources:

  1. https://research.google/pubs/zanzibar-googles-consistent-global-authorization-system/
