A Salute to Read Replicas

Published in AWS in Plain English

6 min read · Jul 5, 2022

The database is the heart of most applications. If your database server goes down, or chokes under high volume and grinds to a halt, it doesn’t really matter that your application servers are still up: they can’t do anything useful for your users without talking to the database server.

The Cloud gives us a really easy way to scale database compute capacity and resilience: the Read Replica!

What’s a Read Replica?

Read replicas are pretty straightforward: a read replica is a read-only copy of your primary database instance that is kept continuously (though not instantaneously) up to date.

In AWS, Amazon Relational Database Service (RDS) allows you to create read replicas of your primary DB instance with basically zero effort. RDS uses asynchronous replication to keep the read replicas up to date with the primary, which means a replica can lag slightly behind it. The specific replication technology used varies, depending on what database engine your primary DB uses (e.g., MySQL, Oracle, PostgreSQL, etc.).
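To make "basically zero effort" concrete, here is a minimal sketch of creating a replica programmatically. It assumes you pass in a boto3 RDS client (for example `boto3.client("rds")`), whose `create_db_instance_read_replica` call does the actual work. The helper name `create_read_replica` and the instance identifiers in the usage note are hypothetical, chosen only for illustration.

```python
def create_read_replica(rds_client, source_id, replica_id,
                        instance_class=None, availability_zone=None):
    """Ask RDS to create a read replica of an existing DB instance.

    rds_client is assumed to be a boto3 RDS client. The replica's size and
    Availability Zone are optional; they are only sent when provided.
    """
    params = {
        "DBInstanceIdentifier": replica_id,       # name of the new replica
        "SourceDBInstanceIdentifier": source_id,  # the primary to copy from
    }
    if instance_class is not None:
        params["DBInstanceClass"] = instance_class
    if availability_zone is not None:
        params["AvailabilityZone"] = availability_zone
    return rds_client.create_db_instance_read_replica(**params)
```

With boto3 installed and credentials configured, a call might look like `create_read_replica(boto3.client("rds"), "orders-primary", "orders-replica-1")`, where both identifiers are made up for this example.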

Read replicas have distinct endpoints, different from the primary DB instance. Your application will have to be configured to connect to the correct endpoint (primary vs read replica), depending on what it needs to do. That means read+write workloads are directed to the primary DB endpoint, while your read-only workloads (e.g., dashboards and report generation) are directed to the read replica.
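In application code, that routing decision can be as small as the sketch below. Both endpoint hostnames are hypothetical placeholders, and `endpoint_for` is an illustrative helper rather than anything RDS provides; your real values come from the RDS console or API.

```python
# Hypothetical endpoints, for illustration only.
PRIMARY_ENDPOINT = "mydb.example.us-east-1.rds.amazonaws.com"
REPLICA_ENDPOINT = "mydb-replica.example.us-east-1.rds.amazonaws.com"


def endpoint_for(read_only: bool) -> str:
    """Pick the endpoint a workload should connect to.

    Read-only work (dashboards, reports) goes to the replica;
    anything that may write goes to the primary.
    """
    return REPLICA_ENDPOINT if read_only else PRIMARY_ENDPOINT
```

A report generator would connect to `endpoint_for(read_only=True)`, while transaction processing would connect to `endpoint_for(read_only=False)`.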

Read replicas are not free — they are priced just like a primary instance. If your read replica is the same size as your primary instance, then it would cost the same. If it is larger or smaller (read replicas don’t need to be exactly the same size as the primary DB instance), then the pricing adjusts as you would expect.

OK, So What’s a Read Replica Good For?

The primary benefit of a read replica is making your database more performant. Since a read replica is effectively a duplicate server, you get that extra compute capacity for your database needs.

And it’s not just that you get twice the computing power — it’s how you get it.

If you doubled the size of your primary DB instance, instead of adding a read replica to it, sure, you’d get equivalent total specs on paper. But with the read replica setup, you can effectively partition your workloads so that heavy read processing can’t bog down your critical transaction processing. If you merely doubled the size of the primary DB instance, a surge of heavy report generation and dashboarding could suddenly slow down the entire DB instance and affect other areas of your application, such as accepting and encoding user transactions and other sorts of data ingestion.

And that’s not all! Read replicas also give you an availability improvement. If you wanted Availability Zone (AZ)-level high availability, you could place your read replica in a different AZ from your primary DB instance. When your primary DB goes down, whether from an instance problem or a genuine AZ-level service disruption, your read replica can be promoted to be a standalone DB, becoming the new primary DB instance. Promotion typically takes only minutes, which is a lot faster than manually creating a new primary DB instance from scratch using a backup. Keep in mind that because replication is asynchronous, the most recent writes may not have reached the replica when the primary failed, and your application has to be pointed at the promoted instance’s endpoint.
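Promotion can also be scripted. The sketch below assumes a boto3 RDS client is passed in and uses its `promote_read_replica` call; the helper name `promote_replica` and the identifier in the usage note are hypothetical.

```python
def promote_replica(rds_client, replica_id, backup_retention_days=None):
    """Promote a read replica to a standalone DB instance.

    rds_client is assumed to be a boto3 RDS client. After promotion the
    instance accepts writes, but it keeps its own endpoint, so the
    application must be reconfigured to send writes there.
    """
    params = {"DBInstanceIdentifier": replica_id}
    if backup_retention_days is not None:
        # Optionally turn on automated backups for the new primary.
        params["BackupRetentionPeriod"] = backup_retention_days
    return rds_client.promote_read_replica(**params)
```

For example, `promote_replica(boto3.client("rds"), "orders-replica-1", backup_retention_days=7)` would promote a replica with that (made-up) identifier and keep seven days of automated backups.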

Use Cases

Let’s translate what we’ve learned so far into more specific, easy-to-grok use cases.

Q: JV, I’m running a database in RDS, and a couple times every week it slows down to a crawl, affecting all of our operations dependent on that DB. I found that it was because of a surge of mid-day or end-day report generation from a bunch of users. What can I do to improve our service?

A: Create a read replica. Point your applications to it for report generation. This way your primary DB instance will be protected from any reporting surge that happens.

Q: JV, I manage an RDS DB instance. To comply with a new IT mandate, our service needs to survive an entire AZ outage. What’s the best way to do this as fast as possible with the least amount of effort?

A: Create a read replica. Place it in a different AZ. When your primary DB instance goes offline due to an AZ outage, promote your replica. As a bonus, you can also point all your heavy read workloads to the read replica endpoint, instead of the primary, giving you a boost in performance and protecting your primary DB instance from sudden read surges that end up choking all your other critical transactions.

Q: JV, I have an RDS DB instance that serves three distinct but equally important internal customers. The Sales Department primarily uses it to encode tons of sales, customers, and item data every single day. The Marketing Department primarily uses it to generate dashboards and complex reports. The Logistics Department also uses it for their own set of heavy reports, though they typically use less than the Marketing Department based on logs. Sometimes, one of the departments ends up bogging the system down too much with heavy reporting usage, and it prevents the other departments from getting work done, especially Sales. What’s the best way to fix this so that all my internal customers are happy?

A: Create read replicas. Create one read replica for the report generation and dashboards of the Marketing Department, and another (possibly smaller) read replica for the report generation of the Logistics Department. Sales should keep targeting the primary instance, since they are write-heavy. Your internal customers are now effectively insulated from each other. For cost-efficiency, make sure to adjust the sizes of your instances as appropriate. Furthermore, if you are in an AWS Region that has three or more AZs, your primary and two replicas can be in three different AZs for increased resilience.

Q: JV, I have an RDS DB instance that powers our company’s website and storefront. I need to prepare it for a surge of customers for tomorrow’s “special sale” event. From past experience, we get almost 5x more traffic from people browsing our storefront, and it usually ends up congesting our DB server. What’s the easiest way I can scale up my capacity?

A: Create read replicas. The quantity and size of the read replicas will depend on your expected read surge. Load balance read requests (such as all storefront viewing and item querying) across your read replicas, each of which has a distinct endpoint. You can do this either with your application platform’s native endpoint load balancing (e.g., some PHP database drivers can do this for you) or with Amazon Route 53 weighted record sets. If, after offloading read requests to your replicas, the current primary DB instance can comfortably handle the extra order volume (assuming your sales event also results in more customer orders, and not just views), then you have nothing more to do. If you are unsure, scale up your primary DB instance a bit to be safe. Review the performance logs after the event so you have better information for your next sale event (e.g., how many replicas you need, what their sizes should be, and whether you should also scale up your primary).
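If your database driver doesn’t load balance for you, a simple application-side alternative is to rotate through the replica endpoints yourself. The sketch below uses plain round-robin, which is just one possible strategy and not something RDS does on your behalf. The `ReplicaPool` class and the endpoint names in the usage note are hypothetical.

```python
import itertools


class ReplicaPool:
    """Hand out read replica endpoints in round-robin order."""

    def __init__(self, endpoints):
        endpoints = list(endpoints)
        if not endpoints:
            raise ValueError("at least one replica endpoint is required")
        # cycle() repeats the endpoints forever, in their original order.
        self._cycle = itertools.cycle(endpoints)

    def next_endpoint(self):
        """Return the endpoint the next read request should use."""
        return next(self._cycle)
```

You would build the pool once, for example `ReplicaPool(["replica-1.example", "replica-2.example"])`, and call `next_endpoint()` each time you open a connection for a read-only request. Round-robin treats all replicas as equal, so if your replicas have different sizes, a weighted approach (such as Route 53 weighted record sets) may fit better.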

Wrap up

There you go, a good dose of read replica love! As a database administrator, you’ll find that a lot of problems can be solved with smart usage of read replicas, from performance to availability and disaster recovery (DR).

Written by JV Roig