Feb 15, 2024
Posted by
Carlo Mencarelli
When looking at relational database options on Amazon Web Services (AWS), it can be confusing whether to choose RDS or Aurora. How do they relate to one another? What are the differences? And most importantly, when should you choose RDS, and when should you choose Aurora?
The Relational Database Service (RDS) is AWS’s basic managed database service for relational databases. RDS is one of Amazon’s oldest services, first released in 2009 and supporting only MySQL. It now supports MariaDB, PostgreSQL, Microsoft SQL Server, Oracle DB, and, most recently, IBM’s Db2.
While AWS doesn’t stop you from running and managing a database on an EC2 instance, using a managed service dramatically reduces the overhead of running one of these databases. Here, a managed service means that AWS handles many of the basic administrative tasks. Examples include installing the database, patching, scaling, and security, to name a few. This allows you to focus on developing your application instead of worrying about your backups.
Aurora RDS is another managed database option offering relational databases. Aurora offers both PostgreSQL- and MySQL-compatible databases. The databases have enhancements to take advantage of the cloud that they are deployed on. These enhancements include better performance, more reliable storage, and additional features to reduce the overhead of running the databases at any required scale. Aurora became generally available for MySQL in 2015, with PostgreSQL compatibility following two years later in 2017.
Amazon’s focus for Aurora has generally been to offer ways to take advantage of native cloud capabilities. By doing so, AWS claims that Aurora has multiple times the throughput of the traditional databases it is based on. It also has features that RDS does not, including Aurora Serverless, Global Database, and Backtrack for MySQL.
When Aurora was first released with MySQL compatibility, RDS had already been generally available for six years. Why would the cloud company devote resources to its own database flavors of popular open-source databases?
The CTO of AWS, Werner Vogels, offered a deep dive into the service and its history a few years ago. The traditional database stacks had been stagnant for years before AWS even existed, and with the cloud, novel problems began to emerge. What if part of an availability zone failed? How is detached storage at scale supposed to work for racks of databases? Vogels states that their goal was to innovate on the old stack and deliver better performance and scalability.
When choosing relational database services, it’s important to recognize which service fits your needs better. How does Aurora compare to RDS? Which should you choose when considering relational database services? Do you need fast I/O on the database? Perhaps you want to keep the database deployment as simple as possible.
AWS intends to make Aurora the clear choice for organizations looking to create a relational database. Below, we look at a few key considerations to remember when choosing between RDS and Aurora.
Aurora’s claim to fame is the performance increase over traditional RDS. Amazon claims 5x throughput for MySQL and 3x throughput for PostgreSQL. The performance increase is achieved by changing how the database is architected at the foundational level. As Vogels mentions, instead of writing the redo logs to disk in multiple disk writes as traditional databases do, Aurora passes this responsibility to the distributed data layer, offloading a significant amount of resource usage from the machine the database engine uses.
There’s been debate on how well these claims of performance gains hold up, as we’ll see below.
When it comes to databases, one of the last things you want to worry about is how available the system is. Aurora uses a cell-based architecture and was designed around how the cloud itself is architected. By default, Aurora creates a multi-AZ (availability zone) deployment, allowing fast recovery in case of a single availability zone failure. A similar feature is available with traditional RDS, though it has slightly more management overhead since Amazon handles failover and replication differently between the two services.
Another claim to fame for Aurora is the distributed data layer. Data is written to three availability zones, with each AZ persisting two copies of each write, for six copies in total. This replication allows for the loss of multiple copies without affecting the read or write availability of the database.
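To make the "two copies in each of three AZs" layout concrete, here is a minimal sketch of how quorum-based storage decides whether reads and writes can proceed. The quorum sizes (4 of 6 for writes, 3 of 6 for reads) are an assumption taken from AWS's public descriptions of Aurora, and the function name is purely illustrative; this is a toy model, not Aurora's implementation.

```python
"""Toy model of a six-copy, quorum-based storage layer (illustrative only)."""

COPIES_PER_AZ = 2
AZ_COUNT = 3
TOTAL_COPIES = COPIES_PER_AZ * AZ_COUNT  # 6 copies, as described in the article

# Assumption: quorum sizes as publicly described by AWS for Aurora.
WRITE_QUORUM = 4
READ_QUORUM = 3


def availability(lost_copies: int) -> dict:
    """Report whether reads and writes still reach quorum after losing copies."""
    if not 0 <= lost_copies <= TOTAL_COPIES:
        raise ValueError(f"lost_copies must be between 0 and {TOTAL_COPIES}")
    healthy = TOTAL_COPIES - lost_copies
    return {
        "healthy": healthy,
        "can_write": healthy >= WRITE_QUORUM,
        "can_read": healthy >= READ_QUORUM,
    }


if __name__ == "__main__":
    for lost in range(TOTAL_COPIES + 1):
        print(lost, availability(lost))
```

Under these assumptions, losing an entire AZ (two copies) leaves both reads and writes available, while losing an AZ plus one more copy still leaves reads available.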
Both Aurora and RDS have a wide array of features to improve the scalability of the database. Both services allow the creation of up to 15 read replicas. These replicas can be within the same AZ, cross-AZ, or even cross-region. Aurora simplifies replica management by providing a reader endpoint that spreads connections across the replicas, so the application doesn’t need to keep track of a different address for each read replica.
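As a rough illustration of what a single reader endpoint buys the application, the sketch below routes statements to one of two addresses instead of tracking every replica. The hostnames are placeholders, the `ClusterEndpoints` and `pick_endpoint` names are hypothetical, and the first-keyword check is deliberately naive (it would misroute a `SELECT ... FOR UPDATE`, for example), so treat it as a starting point only.

```python
from dataclasses import dataclass

# Statements treated as read-only in this simplified sketch.
READ_ONLY_KEYWORDS = {"SELECT", "SHOW", "EXPLAIN"}


@dataclass(frozen=True)
class ClusterEndpoints:
    """The two addresses an application needs: one writer, one reader."""
    writer: str
    reader: str


# Placeholder hostnames, not real endpoints.
EXAMPLE = ClusterEndpoints(
    writer="example.cluster-abc123.us-east-1.rds.amazonaws.com",
    reader="example.cluster-ro-abc123.us-east-1.rds.amazonaws.com",
)


def pick_endpoint(sql: str, endpoints: ClusterEndpoints) -> str:
    """Send read-only statements to the reader endpoint, everything else to the writer."""
    words = sql.split()
    if not words:
        raise ValueError("empty SQL statement")
    first_keyword = words[0].upper()
    if first_keyword in READ_ONLY_KEYWORDS:
        return endpoints.reader
    return endpoints.writer


if __name__ == "__main__":
    print(pick_endpoint("SELECT * FROM orders", EXAMPLE))
    print(pick_endpoint("INSERT INTO orders VALUES (1)", EXAMPLE))
```

With plain RDS read replicas, the same routing function would need a list of replica addresses and its own balancing logic.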
Aurora also supports storage autoscaling, which automatically increases the database storage as it approaches the limit. This saves administration time and avoids potentially costly outages.
Sometimes, an always-on database isn’t required, or the load for the database is very spiky. For these cases, there is Aurora Serverless, a fully managed service where resources are scaled automatically based on load. Compared to RDS, Aurora Serverless can be both a time and cost-saver if utilized properly. Aurora Serverless also works well for non-production environments, where the savings can be even greater.
If working with a large production environment or on a global scale, Aurora offers the ability to scale the database to multiple regions using the Global Database feature. AWS accomplishes this using the same storage layer improvements that were discussed above. The result is a more resilient database that can tolerate a total region failure while also providing data closer to an application’s users.
Like all AWS services, with RDS and Aurora, you only pay for what you use. Each service has numerous options that may increase price, such as storage amounts, proxy usage, etc. Knowing the I/O profile is also crucial since the savings can add up by choosing the appropriate I/O-optimized option. Fortunately, licensing doesn’t come into the equation since MySQL and PostgreSQL both have zero license costs for Aurora and RDS.
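The effect of the I/O profile on the bill can be sketched with a small calculation. The example assumes a pricing model in which a standard configuration charges per I/O request while an I/O-optimized configuration charges nothing per request but has higher instance and storage rates. Every rate below is a made-up placeholder, not an AWS price, and the function names are hypothetical; substitute the current figures from the AWS pricing pages for your region.

```python
def monthly_cost(instance_hours: float, instance_rate: float,
                 storage_gb: float, storage_rate: float,
                 io_millions: float, io_rate_per_million: float) -> float:
    """Sum the three cost components for one month."""
    return (instance_hours * instance_rate
            + storage_gb * storage_rate
            + io_millions * io_rate_per_million)


def cheaper_option(io_millions: float) -> str:
    """Compare two hypothetical configurations for a given monthly I/O volume."""
    hours, storage_gb = 730, 100  # one instance for a month, 100 GB stored
    # Placeholder rates only; NOT real AWS prices.
    standard = monthly_cost(hours, 1.00, storage_gb, 0.10, io_millions, 0.20)
    optimized = monthly_cost(hours, 1.30, storage_gb, 0.225, io_millions, 0.0)
    return "standard" if standard <= optimized else "io-optimized"


if __name__ == "__main__":
    for io_millions in (100, 500, 1000, 2000):
        print(io_millions, cheaper_option(io_millions))
```

With these placeholder rates, the standard configuration wins at low I/O volumes and the I/O-optimized one wins once request charges dominate; the crossover point for a real workload depends entirely on actual prices and usage.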
When creating a new database, what are the circumstances in which you choose RDS? If you are already running another database in RDS, hosting your new one there might make sense for consistency’s sake. RDS allows you to free up engineering resources by taking some of the administrative burden of database management away from your team.
Even if you feel there are reasons for running your database in RDS, the costs need to add up. Understanding RDS pricing doesn’t need to be complicated; we simplified much of it in a previous article. There should be clear cost savings in using RDS over hosting the database yourself.
If you decide to use RDS, you then have another decision. Should you use traditional RDS or Aurora RDS? Sometimes, the choice is clear, for instance, if you need to use an Aurora-specific feature such as Aurora Serverless or Backtrack. Other times, the choice to use Aurora might be less clear.
If you are starting a new project and are unattached to traditional RDS, or you are migrating your database from a non-RDS PostgreSQL or MySQL server, the performance claims and feature set of Aurora RDS might be compelling. You’ll want to ensure you check the performance numbers for your workload as closely as possible since there have been some investigations into AWS’s 5x and 3x throughput claims; it’s debatable how achievable the quoted numbers are.
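One way to check throughput claims against your own workload is a small timing harness that runs the same operation against each candidate database. The sketch below is a minimal version: `measure` is a hypothetical helper, and `run_once` stands in for whatever function executes one representative query through your database driver. A real comparison would also need warm-up runs, concurrent clients, and comparable instance sizes.

```python
import statistics
import time
from typing import Callable


def measure(run_once: Callable[[], None], iterations: int = 1000) -> dict:
    """Run one operation repeatedly and report throughput and latency."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    latencies = []
    start = time.perf_counter()
    for _ in range(iterations):
        t0 = time.perf_counter()
        run_once()  # e.g. execute one representative query
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    latencies.sort()
    return {
        "iterations": iterations,
        "ops_per_sec": iterations / elapsed,
        "median_ms": statistics.median(latencies) * 1000,
        "p95_ms": latencies[int(0.95 * (iterations - 1))] * 1000,
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a call against each database under test.
    result = measure(lambda: sum(range(1000)), iterations=500)
    print(result)
```

Running the same harness against an RDS instance and an Aurora cluster with your own queries gives numbers that are directly relevant to your decision, whatever the published multipliers say.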
Aurora and RDS are good for general-purpose workloads and typical relational database needs. Using a purpose-built database makes a lot of sense if you have specific requirements. Time-series data operating on an append-only ingest pattern is a great example. PostgreSQL can do this, but Timescale was built for it. Offering faster performance for time-series data with PostgreSQL compatibility, Timescale can offer up to 350% faster queries than RDS PostgreSQL.
Benchmarked against Aurora Serverless v2, Timescale was 35% faster to ingest, 1.15x-16x faster to query, 95% more efficient at storing data, and much cheaper.
Try a fully managed TimescaleDB instance today and see the difference.