


Share this post & I'll send you some rewards for the referrals.

How Amazon S3 Achieves Strong Consistency Without Sacrificing 99.99% Availability 🌟

#79: Break Into Amazon S3 Architecture (4 Minutes)

Neo Kim

Jul 29


This post outlines how AWS S3 achieves strong consistency. You will find references at the bottom of this page if you want to go deeper.


Note: This post is based on my research and may differ from actual implementation.

Once upon a time, there was a data processing startup.

They converted raw data into a structured format.

Yet they had only a few customers.

So an on-premise server was enough.

But one morning, they got a new customer with an extremely popular site.

This meant massive data storage needs.

Yet their storage server had only limited capacity.

So they moved to Amazon Simple Storage Service (S3), an object storage service.

It stores unstructured data as objects in a flat namespace, without a directory hierarchy.

S3 exposes a REST API through its web servers.

It stores metadata and file content separately for scale. The metadata of each data object lives in a key-value database.

Also, it caches the metadata for low latency and high availability.
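
Here’s a minimal Python sketch of that split. It assumes plain dictionaries can stand in for the storage nodes, the key-value database, and the metadata cache. The names (TinyObjectStore, content_id) are made up for illustration and are not part of S3’s API.

```python
import uuid


class TinyObjectStore:
    """Toy object store: file content and metadata live in separate places."""

    def __init__(self):
        self.blobs = {}      # content_id -> bytes (stand-in for storage nodes)
        self.metadata = {}   # object key -> metadata (stand-in for the key-value database)
        self.cache = {}      # object key -> metadata (stand-in for the metadata cache)

    def put(self, key, data):
        # Store the content under an opaque id, then record where it lives.
        content_id = uuid.uuid4().hex
        self.blobs[content_id] = data
        meta = {"content_id": content_id, "size": len(data)}
        self.metadata[key] = meta
        self.cache[key] = meta
        return meta

    def get(self, key):
        # Try the cache first; fall back to the key-value database on a miss.
        meta = self.cache.get(key)
        if meta is None:
            meta = self.metadata[key]
            self.cache[key] = meta
        return self.blobs[meta["content_id"]]


store = TinyObjectStore()
store.put("logs/2024/app.log", b"hello")
print(store.get("logs/2024/app.log"))  # b'hello'
```

The key looks like a path, but it’s just a string. There are no real folders behind it.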

Strong Consistency vs Eventual Consistency

Although caching metadata improves performance, some requests might return an older version of the metadata.

Because in a distributed architecture, a write might go to one cache node, while a later read goes to another cache node that hasn’t seen the update yet.

Thus causing eventual consistency.
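
A toy Python sketch shows how such a stale read can happen. The two cache nodes, the metadata_store dictionary, and the helper names are all hypothetical, and a real cache is far more involved.

```python
class CacheNode:
    """One node of a hypothetical distributed metadata cache."""

    def __init__(self):
        self.entries = {}


metadata_store = {}                      # source of truth: key -> version
node_a, node_b = CacheNode(), CacheNode()


def write(node, key, version):
    # The write reaches the metadata store and only ONE cache node.
    metadata_store[key] = version
    node.entries[key] = version


def read(node, key):
    # The read trusts whatever this cache node already holds.
    if key in node.entries:
        return node.entries[key]
    value = metadata_store[key]
    node.entries[key] = value
    return value


write(node_a, "report.csv", 1)
read(node_b, "report.csv")               # node B now caches version 1
write(node_a, "report.csv", 2)           # node B never hears about version 2

print(read(node_a, "report.csv"))        # 2
print(read(node_b, "report.csv"))        # 1 (stale)
```

Node B will keep serving version 1 until its entry is refreshed or evicted.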

But they needed strong consistency in data processing for ordering and correctness.

And building extra app logic for strong consistency would increase infrastructure complexity.

Onward.


S3 Strong Consistency

It’s difficult to achieve strong consistency at scale without performance or availability tradeoffs.

So smart engineers at Amazon used simple ideas to solve this hard problem.

Here’s how:

1. Write Path

They update the metadata cache using the write-through pattern.

It means the write coordinator updates the cache first. Then it updates the metadata store synchronously, before acknowledging the write.

Thus reducing the risk of a stale cache.
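
Here’s a minimal sketch of the write-through pattern in Python. It assumes dictionaries stand in for the cache and the metadata store. The WriteCoordinator class and its method are hypothetical names, and failure handling (for example, undoing the cache update if the store write fails) is left out.

```python
class WriteCoordinator:
    """Hypothetical coordinator that applies the write-through pattern."""

    def __init__(self, cache, metadata_store):
        self.cache = cache
        self.metadata_store = metadata_store

    def put_metadata(self, key, metadata):
        # Step 1: update the cache first, as described above.
        self.cache[key] = metadata
        # Step 2: update the metadata store in the same call.
        # The caller only gets a result after both updates are done.
        self.metadata_store[key] = metadata
        return metadata


cache, metadata_store = {}, {}
coordinator = WriteCoordinator(cache, metadata_store)
coordinator.put_metadata("photos/cat.jpg", {"version": 1, "size": 2048})
print(cache == metadata_store)  # True
```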

Also, they set up a separate service to track cache freshness and called it the Witness. It tracks only the latest version of each data object, not the data itself. So it stays lightweight and lives in memory for low latency. And the write coordinator notifies the witness whenever there’s a metadata update.

Besides, they introduced a transaction log in the metadata store. It records the order of operations, which lets them check whether the cache is fresh.
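
The sketch below shows one way these two pieces could fit together in Python. It assumes each write gets an increasing sequence number from the transaction log, and that the witness remembers only the highest number it has seen per key. The class and method names (MetadataStore, Witness, notify, is_fresh) are my own and not confirmed by Amazon.

```python
import itertools


class MetadataStore:
    """Toy metadata store with an append-only transaction log."""

    def __init__(self):
        self.records = {}                  # key -> (sequence number, metadata)
        self.log = []                      # ordered list of (sequence number, key)
        self._next_seq = itertools.count(1)

    def put(self, key, metadata):
        seq = next(self._next_seq)
        self.log.append((seq, key))        # the log fixes the order of operations
        self.records[key] = (seq, metadata)
        return seq


class Witness:
    """In-memory tracker: remembers only the latest sequence number per key."""

    def __init__(self):
        self.latest = {}

    def notify(self, key, seq):
        # Called by the write coordinator after every metadata update.
        if seq > self.latest.get(key, 0):
            self.latest[key] = seq

    def is_fresh(self, key, cached_seq):
        # A cached entry is fresh if it is at least as new as the latest write.
        return cached_seq >= self.latest.get(key, 0)


store, witness = MetadataStore(), Witness()
first = store.put("report.csv", {"size": 500})
witness.notify("report.csv", first)
second = store.put("report.csv", {"size": 900})
witness.notify("report.csv", second)

print(witness.is_fresh("report.csv", first))   # False
print(witness.is_fresh("report.csv", second))  # True
```

The witness never stores metadata or file content, only a number per key. That keeps it small enough to live in memory.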

Ready for the best part?

2. Read Path

Here’s the read request workflow:

The server queries the cache.
It then asks the witness whether the cache has the latest data.
The server queries the metadata store only if the cache is stale.

Thus achieving strong consistency.

Put simply, they find out if the metadata cache is fresh using the witness. Think of the witness as a central observer of metadata changes and a checkpoint for reads.

Also, they assume the cache is stale if the server cannot reach the witness. In that case, they fetch the metadata directly from the metadata store.
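
The read path, including the fallback for an unreachable witness, can be sketched in Python like this. It assumes cache entries and store records are (version, metadata) pairs. All names here (read_metadata, InMemoryWitness, UnreachableWitness, WitnessUnavailable) are hypothetical.

```python
class WitnessUnavailable(Exception):
    """Raised when the (hypothetical) witness cannot be reached."""


class InMemoryWitness:
    def __init__(self, latest):
        self.latest = latest               # key -> latest known version

    def is_fresh(self, key, cached_version):
        return cached_version >= self.latest.get(key, 0)


class UnreachableWitness:
    def is_fresh(self, key, cached_version):
        raise WitnessUnavailable(key)


def read_metadata(key, cache, witness, metadata_store):
    cached = cache.get(key)                        # 1. query the cache
    if cached is not None:
        version, metadata = cached
        try:
            if witness.is_fresh(key, version):     # 2. ask the witness
                return metadata
        except WitnessUnavailable:
            pass                                   # no answer: assume stale
    version, metadata = metadata_store[key]        # 3. go to the metadata store
    cache[key] = (version, metadata)               # refresh the cache
    return metadata


metadata_store = {"report.csv": (2, {"size": 900})}
stale_cache = {"report.csv": (1, {"size": 500})}
witness = InMemoryWitness({"report.csv": 2})

print(read_metadata("report.csv", stale_cache, witness, metadata_store))  # {'size': 900}
print(stale_cache["report.csv"][0])                                       # 2
```

Treating an unreachable witness as "stale" costs an extra trip to the metadata store, but it never returns old data.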

Yet the witness shouldn’t affect S3’s overall performance or 99.99% availability.

So they scale the witness servers horizontally. And they set up automation to replace failed servers quickly.

Besides, they redistribute the traffic when a witness server fails.
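
The post doesn’t say how S3 spreads keys across witness servers. As one possible illustration, the Python sketch below uses rendezvous (highest random weight) hashing, which is my own choice and not a confirmed S3 technique. The server names and the pick_witness function are hypothetical.

```python
import hashlib


def pick_witness(key, servers, healthy):
    """Pick a witness server for a key, skipping servers marked unhealthy."""
    candidates = [s for s in servers if s in healthy]
    if not candidates:
        return None  # no witness reachable: the caller treats the cache as stale

    def score(server):
        # Deterministic score per (server, key) pair.
        digest = hashlib.sha256(f"{server}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # The server with the highest score owns the key.
    return max(candidates, key=score)


servers = ["witness-1", "witness-2", "witness-3"]
owner = pick_witness("report.csv", servers, healthy=set(servers))
backup = pick_witness("report.csv", servers, healthy=set(servers) - {owner})
print(owner != backup)  # True
```

With this scheme, only the keys owned by the failed server move elsewhere. Keys on the other servers stay where they are.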

S3 supports 100 trillion data objects at 10 million requests per second.

It offers strong read-after-write consistency using the witness and a consistent metadata cache.

Thus making the app logic simpler.




References
