Lessons from Amazon’s DynamoDB paper
Learning from Amazon’s experiences with distributed NoSQL
In this second installment of our Dynamo Papers series, we’re diving into Amazon’s 2022 DynamoDB paper. If you missed part one, you can catch up here.
How it came to be
The evolution of DynamoDB provides a glimpse into a system that has matured over more than a decade. Its journey offers a lot to think about when designing systems at scale.
DynamoDB evolved from a combination of the original Dynamo system and Amazon’s first attempt at database-as-a-service, SimpleDB. Both systems had limitations that hindered adoption and motivated the development of DynamoDB. For the original Dynamo, it was single-tenancy: teams had to set up and actively manage their own Dynamo instances, which added operational complexity. For SimpleDB, it was unpredictable latency and limited table sizes (10GB per table), which led users to divide their data across multiple tables to handle an application’s workload.
Learning from its experiences with Dynamo and SimpleDB, Amazon created DynamoDB: the high performance and scale of Dynamo, coupled with SimpleDB’s fully managed, multi-tenant model and a richer data model than a plain key-value store. Launched in 2012, DynamoDB set out to deliver consistent, single-digit millisecond latency at any scale.
Core Functional Features
Before diving into the technical architecture, here are a few defining characteristics of DynamoDB:
- Fully Managed Service: Developers can focus on building applications rather than managing infrastructure, as DynamoDB handles the complexities of data storage and management.
- Multi-Tenant Architecture: DynamoDB stores data for multiple customers on the same physical hardware, maximizing resource utilization while maintaining strong isolation.
- No Table Size Limits: DynamoDB tables can grow without a predefined size cap and without manual intervention.
- Predictable Performance: Thanks to automatic partitioning and horizontal scaling, DynamoDB delivers consistent, single-digit millisecond latency even at scale.
- High Availability: With an SLA of 99.99% for standard tables and 99.999% for global tables, DynamoDB ensures your data is always accessible.
Architecture
Here are some highlights of DynamoDB’s architecture:
Partitioning and Replication
DynamoDB splits each table into partitions, each covering a contiguous range of the table’s key space, and replicates every partition across multiple nodes to ensure fault tolerance. This replication helps the system recover quickly in the event of a failure. Every item is identified by a primary key, which includes a partition key that determines where the item is stored. DynamoDB also supports ACID (Atomicity, Consistency, Isolation, Durability) transactions, allowing it to provide strong consistency when necessary without sacrificing high availability and low latency.
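To make the routing concrete, here’s a minimal sketch of how a partition key could map to a partition. DynamoDB’s actual hash function and partition metadata are internal, so the md5 hash and the hard-coded boundaries below are purely illustrative assumptions.

```python
import hashlib
from bisect import bisect_right

# Hypothetical partition map: each partition owns a contiguous range of
# the 128-bit hash space (four equal ranges here, for illustration only).
PARTITION_BOUNDARIES = [2**128 // 4, 2**128 // 2, 3 * 2**128 // 4]

def partition_for(partition_key: str) -> int:
    """Hash the partition key, then find the partition whose range contains it."""
    h = int(hashlib.md5(partition_key.encode()).hexdigest(), 16)  # 128-bit value
    return bisect_right(PARTITION_BOUNDARIES, h)

print(partition_for("customer#1234"))  # deterministic partition index, 0-3
```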
DynamoDB uses a leader-election mechanism based on the Multi-Paxos consensus protocol. In this system, any node can trigger a leader election if it detects that the current leader is unresponsive. The leader is responsible for handling write operations and strongly consistent reads, while other replicas serve eventually consistent reads. This model allows DynamoDB to balance between strong consistency and high availability, depending on the use case.
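This split is visible in the public API: GetItem accepts a ConsistentRead flag. A quick boto3 example (the table and key names are made up):

```python
import boto3

# The default, eventually consistent read may be served by any replica
# (and consumes half the read capacity); a strongly consistent read is
# routed to the leader.
table = boto3.resource("dynamodb").Table("orders")

eventual = table.get_item(Key={"order_id": "1234"})
strong = table.get_item(Key={"order_id": "1234"}, ConsistentRead=True)
```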
One of the key features that allows DynamoDB to scale efficiently is its approach to replication. When a write occurs, the leader node accepts the request, logs it in a write-ahead log (WAL), and then replicates the write to its replicas. The write is only acknowledged to the client once a quorum of replicas has confirmed the operation.
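Here is a minimal sketch of that write path, assuming a three-replica group and a quorum of two; the replica interface and failure handling are simplified assumptions, not DynamoDB’s actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

class Replica:
    """Assumed replica interface: append a record to its write-ahead log."""
    def __init__(self):
        self.wal = []
    def append_wal(self, record) -> bool:
        self.wal.append(record)
        return True  # a real replica could fail or time out here

QUORUM = 2  # majority of a three-replica group

def leader_write(leader_wal: list, peers: list, record: dict) -> bool:
    """Append to the local WAL, replicate, and ack once a quorum confirms."""
    leader_wal.append(record)  # durable on the leader: the first ack
    acks = 1
    with ThreadPoolExecutor(max_workers=len(peers)) as pool:
        futures = [pool.submit(p.append_wal, record) for p in peers]
        for future in as_completed(futures):
            if future.result():
                acks += 1
            if acks >= QUORUM:
                return True  # safe to acknowledge the client
    return False  # quorum not reached: fail or retry the write

print(leader_write([], [Replica(), Replica()], {"key": "a", "value": 1}))  # True
```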
Replication can be carried out to either a storage replica or a log replica. A storage replica in DynamoDB consists of two key components: the write-ahead logs (WAL) and the B-trees that store the key-value data. To ensure high availability and low latency, DynamoDB takes a proactive approach to node failures. When a node is detected to be down, the system creates a log replica, which copies only the write-ahead logs, rather than the entire B-tree.
This approach is beneficial for two reasons:
- Faster Recovery: Copying just the write-ahead logs (instead of the full B-tree) reduces the time it takes to create a new replica from minutes to just a few seconds.
- Minimized Disruption: By replicating the write-ahead logs first, the system can quickly reach a quorum of nodes, ensuring that write availability is not affected during recovery. This prevents potential disruptions in write operations, which could otherwise lead to system downtime.
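A toy sketch of the two replica flavors described above, with a dict standing in for the B-tree and a list for the WAL; the healing step shows why a log replica is so cheap to create.

```python
from dataclasses import dataclass, field

@dataclass
class StorageReplica:
    wal: list = field(default_factory=list)    # write-ahead log entries
    btree: dict = field(default_factory=dict)  # stand-in for the key-value B-tree

@dataclass
class LogReplica:
    wal: list = field(default_factory=list)    # WAL only: no B-tree to copy

def heal_replication_group(healthy: StorageReplica) -> LogReplica:
    # Copying recent WAL entries takes seconds; rebuilding a full B-tree
    # takes minutes. The log replica restores the write quorum right away.
    return LogReplica(wal=list(healthy.wal))

group = StorageReplica(wal=[("put", "k", "v")], btree={"k": "v"})
print(heal_replication_group(group))  # LogReplica(wal=[('put', 'k', 'v')])
```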


Global Admission Control (GAC)
In earlier versions, DynamoDB required users to pre-provision read and write capacity. This could lead to inefficiencies, particularly when traffic patterns were uneven across partitions.
As an example, suppose you provision 100 units of write capacity for a table and each partition can handle 50 units; the table would be split across two partitions. Since capacity is statically divided per partition, if your application consumes 60 units from one partition and 20 from the other, requests to the first partition will be throttled even though total usage is under the provisioned capacity. To make matters worse, if the increased load comes from a small number of contiguous hot keys, increasing capacity to 120 units would split the load across three partitions, reducing per-partition throughput to 40 units and making throttling even more likely.
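The arithmetic is easy to check in a few lines. The helper below divides capacity evenly per partition, which is exactly the static behavior that caused the problem:

```python
def throttled(partition_loads: list, table_capacity: float) -> list:
    """Per partition: does its load exceed its static share of table capacity?"""
    share = table_capacity / len(partition_loads)
    return [load > share for load in partition_loads]

print(throttled([60, 20], 100))      # [True, False]: throttled at 80/100 total usage
print(throttled([60, 40, 20], 120))  # [True, False, False]: more capacity, same problem
```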
To resolve this, Amazon introduced Global Admission Control (GAC), which tracks the throughput usage of the entire table across all partitions. With GAC, throughput is no longer tied to individual partitions, meaning that if one partition needs more capacity, it can draw from the global pool. This centralized management of throughput solved the issues of uneven distribution and throttling. GAC is a highly available replicated service that utilizes an in-memory distributed data store to provide the admission control functionality.
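The paper frames GAC in terms of tokens that request routers draw from a table-level allowance. Here is a hedged sketch of that idea; the API shape and batch size are assumptions, and refill over time is omitted for brevity.

```python
class GlobalAdmissionControl:
    """Stand-in for the replicated, in-memory GAC service: one table-wide pool."""
    def __init__(self, table_capacity: float):
        self.remaining = table_capacity

    def request_tokens(self, want: float) -> float:
        grant = min(want, self.remaining)
        self.remaining -= grant
        return grant

class RequestRouter:
    """Routers cache a local allowance and refill it from GAC, not per partition."""
    def __init__(self, gac: GlobalAdmissionControl, batch: float = 100):
        self.gac, self.batch, self.tokens = gac, batch, 0.0

    def admit(self, cost: float) -> bool:
        if self.tokens < cost:
            self.tokens += self.gac.request_tokens(self.batch)  # refill globally
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # the whole table is out of capacity, not one partition

gac = GlobalAdmissionControl(table_capacity=100)
router = RequestRouter(gac)
print(router.admit(60), router.admit(20))  # True True: the 60/20 pattern now fits
```

Because the pool is table-wide, a hot partition can borrow capacity another partition isn’t using, which is exactly what the static per-partition scheme forbade.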

Noisy Neighbors & Colocation
DynamoDB’s multi-tenant model, combined with GAC, introduced a new challenge: colocation. Multiple customers’ data ends up on the same physical machine, which can lead to noisy neighbors, situations where one tenant’s high traffic degrades the performance of others. With GAC this is an especially hard problem to solve, because a table’s partition on a node is no longer restricted to a statically provisioned capacity as it was before.
To mitigate this, Amazon moved the responsibility for monitoring throughput to the storage node itself. When a storage node nears its throughput capacity, it informs the auto admin service, which then finds or provisions a new node for the hot partition. This proactive scaling mechanism ensures that no single tenant can monopolize the resources of a shared node.
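A sketch of that reporting loop, with the node capacity, the 80% threshold, and the auto admin API all invented for illustration:

```python
NODE_CAPACITY = 10_000  # units/sec one storage node can serve (assumed)

class AutoAdmin:
    """Assumed control-plane API; the real service has many other duties."""
    def move_partition(self, partition_id: str) -> None:
        print(f"relocating hot partition {partition_id} to a fresh node")

def monitor_storage_node(partition_loads: dict, auto_admin: AutoAdmin) -> None:
    """Each node tracks its own aggregate throughput and asks for help early."""
    total = sum(partition_loads.values())
    if total > 0.8 * NODE_CAPACITY:  # nearing capacity: act before tenants suffer
        hottest = max(partition_loads, key=partition_loads.get)
        auto_admin.move_partition(hottest)  # find or provision a new node for it

monitor_storage_node({"p1": 7_000, "p2": 2_500}, AutoAdmin())  # triggers relocation
```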
Auto-partitioning
While GAC and the auto admin service solved short-term capacity issues, Amazon needed a way to scale for sustained growth. As a partition’s throughput approaches certain limits, DynamoDB automatically splits it based on the access patterns observed through GAC. This prevents a single partition from becoming a bottleneck as traffic grows.
DynamoDB is also smart enough to know when further partitioning wouldn’t help. If a partition is bottlenecked by a single hot key, splitting it is pointless: the hot key’s traffic can’t be divided between the resulting partitions. This “smart splitting” mechanism avoids wasted splits that would only add overhead.
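An illustrative sketch of that decision, assuming a log of recent key accesses is available; the 50% threshold and the median-of-traffic split point are assumptions, not the paper’s exact rules.

```python
from collections import Counter

def choose_split(key_access_log: list, hot_key_share: float = 0.5):
    """Pick a split key from observed traffic, or None if splitting won't help."""
    counts = Counter(key_access_log)
    total = sum(counts.values())
    _, top_hits = counts.most_common(1)[0]
    if top_hits / total >= hot_key_share:
        return None  # a single hot key dominates: its load can't be divided
    # Split at the median of observed traffic rather than the middle of the
    # key range, so each child partition receives roughly half the load.
    cumulative = 0
    for key in sorted(counts):
        cumulative += counts[key]
        if cumulative >= total / 2:
            return key

print(choose_split(["a", "b", "b", "c", "d"]))  # "b": a useful split point
print(choose_split(["hot"] * 9 + ["x"]))        # None: don't bother splitting
```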
Takeaways from the paper
The paper itself gives us these takeaways:
- Adapting to customers’ access patterns by reshaping physical partitioning automatically decreases latency and improves the customer experience.
- Designing systems for predictability over absolute efficiency improves system stability.
For me, the second point is the key takeaway: systems should be designed to handle the unexpected without compromising performance. DynamoDB’s ability to absorb sudden traffic bursts or sustained long-term growth without major disruption is a defining feature. For instance, while caching is often used to improve read performance, we shouldn’t rely on it too heavily as a primary means of scaling: a cold or degraded cache turns into a sudden spike of load on the system behind it. If something can go wrong, it likely will, so we must build systems that handle failure gracefully.
Conclusion
DynamoDB was a pioneer among cloud-native NoSQL databases. Its evolution over the years, from addressing single-tenant scalability issues to implementing advanced auto-partitioning and burst handling, has provided invaluable insights into building distributed systems. The lessons learned from real-world challenges, such as managing multi-tenant storage and handling uneven access patterns, have shaped DynamoDB into the powerful service it is today.
The beauty of DynamoDB’s architecture lies in its ability to adapt and scale intelligently. With features like Global Admission Control (GAC) and smart partitioning, DynamoDB is an excellent case study in how systems can evolve to meet growing demands while maintaining performance and availability. The paper is also a rare public account of more than a decade of iteration on one system, and there’s a lot to take away from it.
Sources:
- Elhemali et al., “Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service,” USENIX ATC 2022.