Database Read Replicas vs PolyScale Database Edge Caching

Mark Rendle & Sam Aybar
Jun 29, 2022

Database read replicas are often implemented to help with database scaling challenges. These challenges fall into two categories - reducing load on the origin database and reducing latencies for applications spanning multiple geographic regions.

PolyScale is a database edge cache that has some use case parity with read replicas. A replica and a cache however, by their very nature, are quite different offerings (though they certainly can be used together). In this post we will compare the pros and cons of both and discuss key criteria for architectural decisions for database scaling.

What is a Database Read Replica?

A read replica is a live copy of a primary database that is kept up to date with changes to that source, and provides read-only access to the data. All INSERTs, UPDATEs and DELETEs are run on the primary, and those changes are copied to the replica, usually by using the Write-Ahead Logs (WALs) from the primary database.

Read replicas are commonly used to reduce load on the primary database; frequent or complex queries can be run against the replicas without affecting the performance of the primary. Multiple replicas may be used to allow clients to run many complex queries simultaneously without sacrificing performance. This is especially useful for read-intensive workloads like reporting systems, CRMs or analytics platforms.

Read replica databases can also be hosted in different data centers; useful if the application is hosted in multiple locations. The major cloud hyperscalers - Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP), all provide read replica functionality for their managed database services across the major RDBMS engines: PostgreSQL, MySQL, MariaDB and Microsoft SQL Server. We’ll look at those services later in this post.

Read Replica Pros

Data Consistency

Read replicas are by design, a copy of the origin database and guaranteed to receive all data updates from the primary database in an eventually consistent manner, regardless of where the update originated. A read replica gets updated using logs from the primary database, so it always has the latest data (as long as the replication process is running smoothly).

Ease of Deployment

When using a managed database service from a cloud platform such as AWS, Microsoft Azure or GCP, creating a read replica is very simple; you simply choose a data center (geographic location) and/or availability zone and the replica is created and configured automatically. This makes it easy to scale as you can create more replicas as needed (although there is an initial delay while the data synchronizes). Different managed RDBMS platforms have different limits on the number of replicas you can create: at the time of writing, AWS and Microsoft Azure allow up to five replicas; Google Cloud Platform allows up to ten.

Use as Failover / Disaster Recovery

A read replica can be promoted to become the primary database in the event of the original primary becoming unavailable or corrupted. In critical, zero-downtime scenarios this is a valuable feature; however, it is not without its complexities.

Once a replica has been promoted to primary, the process is irreversible. That replica is now the primary database. Applications that write to the database will need to be updated with the new connection string to continue working. Also, any existing replicas that were linked to the original primary will need to be recreated, pointing at the new primary. If it is required to return the primary to the original data center, a new replica must be recreated in that data center, wait for it to synchronize, and then promote that replica to primary again. It can be a complicated process, but it’s better than extended outages.

Read Replica Cons

Replication Lag

Generally, the time between a change being made to the primary database and that change showing up in a replica is on the order of milliseconds within a single data center, and a second between different data centers. However, in some circumstances there can be a significant lag in this process - maybe seconds or even minutes. That means applications reading from replicas are often reading stale data. If the primary database is very busy, with lots of changes being made, the replication process slows down. In some extreme cases, the additional load caused by the replication process itself can cause performance problems on the primary database.

Separating Reads and Writes

Whether you are using read replicas for scaling within a single data center, or for geo-replication for applications running in different data centers, you need to make changes to the application code to work with the replicas. Any data manipulation statements must be run against the primary database; for best performance, any data query statements should be run against a replica. (Some software architects might count this as a pro; the separation of reads and writes is enforced in architectural patterns like CQRS, for example.)

Of course, if you are using a third-party application like a CMS, this may not be an option unless such a feature has been included. For applications like Wordpress, there are plugins for supporting read replicas.

Managing DNS and Connection Strings

Having multiple databases means having multiple DNS names for the servers, and multiple connection strings to manage for the applications. If you have multiple replicas for extra scale, you might want to use round-robin load balancing, meaning not only having separate connection strings for reads vs writes, but you have multiple possible connection strings for reads. Managing that in code, configuration and CI/CD can get complicated.


Perhaps the most significant downside of read replicas is that every read replica is another full instance of the database, even though it is in read-only mode, not accepting writes. A read replica has to be at least as powerful as the primary in terms of CPU count, memory and storage. If the primary has 8 cores, 64GB of memory and 1TB of SSD storage, the read replica needs to match that spec, and so it will cost as much as the primary again. Having five read replicas will add 500% to the database hosting bill.

For geographically distributed read replicas, you will also be charged for the data EGRESS costs for the logs being used to update the replicas. Whether this is significant depends heavily on the particular scenario, but it is worth keeping in mind.

Limited Scale

There are limits to how many read replicas you can have for a database, simply because of the overhead of shipping the Write-Ahead Log (WAL) data to the replicas. For example, Azure SQL Database and AWS RDS limit you to five replicas of a PostgreSQL or MySQL database. This can be a significant limiting factor if you are running application code in a large number of data centers. As well as the hard limit of replicas imposed by a specific cloud provider, limitations can be reached much earlier with a high workload database. Adding additional read replicas to a well utilized database can result in performance being significantly impacted.

Read replica count limitations can also be a critical blocker if the application tier is using serverless, Function-as-a-Service (FaaS) architectures (such as Netlify Edge Functions, Cloudflare Workers etc) or global container platforms like or These environments may have dozens of locations to support which is not achievable with read replica architectures and hence latencies will be impacted.

What is PolyScale?

PolyScale is a plug-and-play global database Cache-as-a-Service (CaaS) that reduces global query latency for databases. Using PolyScale, query execution for database reads is distributed and cached, seamlessly scaling your current database without writing code or altering query or transactional semantics.

PolyScale’s proxies and caches handle database traffic globally using native database wire protocols such as MySQL and PostgreSQL. This means PolyScale looks just like your existing database to your application, so can be used without any code or infrastructure changes: just update your application’s database origin to use PolyScale instead of your database directly.

Reads (SQL SELECTs) are cached geographically close to the requesting source for accelerated performance. All other traffic (INSERT, UPDATE and DELETE etc) passes through to the origin database as normal, remaining strongly consistent.

PolyScale can be used to reduce load on the origin database, just like a read replica. And just like a read replica, PolyScale can provide read access to data close to the client application, and so reduce latency and data transfer over the Internet, with big performance improvements.

So what are the pros and cons of read replicas compared to PolyScale?

PolyScale Pros

Ease of Setup

PolyScale is very easy to set up: click “New Cache” in the dashboard and enter the hostname and port of the origin server. This returns a new connection string to use in your applications, and that’s the only change that needs to be made. PolyScale’s AI automatically infers caching policies for your data based on query patterns with no manual configuration, although of course you can step in and set explicit policies if you choose to.

This is powerful, especially for complex applications that may involve thousands of unique queries. Asking Developers to determine what to cache and for how long is an impossible moving target.

Single Connection String & Read/Write Logic

Because PolyScale is a proxy to your origin database, data manipulation statements are passed straight through, while queries are served from the cache when possible. This means you only need a single connection string for reads and writes in your application.

In addition, there is no need to separate or modify the application logic to manage reads vs writes. Nothing changes in your application codebase, which allows it to work with any third party software already in place.

Global Edge Network

PolyScale has Points-of-Presence (PoPs) in multiple edge locations around the world, meaning you can scale globally, with caches next to your monoliths, edge functions or container services.

With PolyScale in place, scaling applications is simply determined at the application tier, where there are numerous options. PolyScale’s DNS will automatically resolve to the closest location to serve data. The database reads become global, immediately.

Positive Performance Impact on Origin Database

As noted, read replicas incur a performance hit on the origin database, which has to send write-ahead logs to each replica and confirm that the data was written successfully. With high volumes of transactions this can significantly slow down both reads and writes on the origin, regardless of traffic to the read replica instances. By contrast, PolyScale runs queries against the origin only when data is requested from the cache and is not already available (cache miss). This only reduces the origin database load, it never increases it.

Linear Performance and Autoscaling

Read replicas are full databases and are therefore susceptible to the same performance issues of the origin - with the exception of course of read/write contention! For example, a poorly optimized query may run slowly on both the origin and all replicas. Whereas a read replica still has to run every query against its copy of the data, PolyScale’s cache engine will always return cached responses in < 1ms, regardless of the complexity of the query, and with massive concurrency. This performance does not degrade significantly as load increases meaning periods of extremely high load can be served without performance degradation.

Scale for Serverless Edge Functions

Modern architectures make use of endlessly-scalable serverless platforms like AWS Lambda@Edge, Netlify Edge Functions, Cloudflare Workers, etc. Hundreds or thousands of instances of a function may run concurrently, each trying to open its own connection to the database, which can easily exceed the number of connections available on a traditional RDBMS. PolyScale implements its own connection pooling (configurable per cache) within the cache and can cope with very high numbers of concurrent connections. The Pooling mechanism caches the TCP handshake and given the PolyScale PoP is generally closer to the application than the origin, new connections are very fast.

With a read replica, you might need to manually resize instances to cope with increased load, or over-provision and waste cycles and money when the instance is not being used. PolyScale handles on-demand scaling during load spikes automatically, with no additional costs or wasted resources.

Polyglot Persistence

Today, PolyScale supports MySQL, PostgreSQL and MariaDB, with MS SQL server in Early Access. As more databases are supported, scaling data globally across platforms becomes effortless. The setup process and developer experience are the same for every supported platform.

PolyScale Cons

Disaster Recovery of Origin Database

PolyScale provides intelligent caching of queries; it does not keep a complete copy of the origin database, so if that goes offline or becomes corrupted, PolyScale cannot be used as a disaster recovery solution.

The PolyScale network itself however is highly available with automatic failover in the event of a Point of Presence outage.

Data Consistency

As previously noted, read replicas receive all data updates from the origin db. These are considered eventually consistent as there are no guarantees as to when the data will arrive at any given replica. A read replica is a copy of the origin database in its entirety (unless configured otherwise).

When considering caching however, the challenge of data consistency moves to invalidation. How and when is the data invalidated to make sure fresh data is being served? This is a fundamental difference between a database and a cache.

PolyScale addresses cache invalidation using two core principles: AI managed invalidation and Smart Cache Invalidation.

PolyScale inspects each individual SQL query and builds models in real-time, based on many inputs such as how frequently the data is changing (by analyzing the response payload sizes and inter-query arrival times) and adjusting the cache times accordingly. This automates the invalidation process with high confidence, without any human input.

Secondly, PolyScale’s Smart Invalidation will intelligently invalidate any cached data when it sees a data manipulation (DML) statement (e.g. INSERT, UPDATE or DELETE) affecting the relevant table, as long as the DML statement is run against the PolyScale proxy.


Database read replicas can be used as a solution to geo-distribution of data and to optimize performance against busy databases, but at a potential cost of complexity of infrastructure and application configuration and development, as well as a financial cost. Read replicas can also form part of a high availability and backup strategy so it is worth considering such benefits.

When considering the requirements for low latency queries globally, PolyScale provides intelligent caching that is easy to set up and work with, and cost effective compared to running full read replicas. Its global edge network ensures low latency everywhere, without the need to think about deploying the next read replica or the impact of doing so. PolyScale also includes scaling features (such as connection pooling and traffic shaping) that are designed to mitigate design limitations of traditional databases for use at the edge with serverless functions.

Finally, there is of course no reason why PolyScale should not be used in conjunction with a read replica to maximize the reliability and performance of your database infrastructure.

Next Steps

Ready to try Sign up for a free account here (no credit card required). Or try out PolyScale without bringing your own database with our live interactive demo playground.

Read the Quick Start Guide to understand how easily PolyScale can be integrated without code and without deploying servers.

Written by

Mark Rendle & Sam Aybar