
Postgres Autoscaling: Aurora Serverless v2 vs Neon

Comparing how autoscaling works in both platforms


With Neon autoscaling just going GA, we’re inviting every Neon user to enable autoscaling for their databases. If you’re new to Neon, you might be here looking for an alternative to Aurora Serverless v2 for your Postgres autoscaling. This blog post provides a quick comparison of the main autoscaling parameters of both platforms.

Neon is a serverless Postgres database that prioritizes development velocity and easy management. Like Aurora Serverless, it decouples storage and compute, and it dynamically scales your database up and down according to load. Unlike Aurora Serverless, Neon offers a premium developer experience via database branching, native connection pooling, API-based operations, and many other features—plus scale-to-zero and a more responsive scaling algorithm, especially for scaling down.

How autoscaling works in Aurora Serverless v2 

This section focuses on describing compute autoscaling, since it’s the topic of this blog post. We’re not focusing on other aspects of the Aurora architecture, like storage or configuring clusters for high availability. All the information in this section is taken from the Aurora documentation.

Aurora Serverless v2 adjusts compute capacity in real-time based on demand, using units called Aurora Capacity Units (ACUs). According to the AWS documentation, 1 ACU represents “approximately 2 GiB of memory and corresponding CPU and networking resources”, which currently means 0.25 vCPU as stated in this recent paper.

Scaling range 

Aurora Serverless v2 instances can autoscale from 0.5 ACU (the smallest possible capacity) all the way up to 128 ACUs. They don't scale down to zero.

Scaling increments

Aurora Serverless v2 scales up and down in 0.5 ACU increments.

Disruptiveness 

Aurora v2 handles compute autoscaling nondisruptively. Ongoing connections and transactions continue without interruption during capacity adjustments.

Scaling frequency and responsiveness

Aurora Serverless v2 is a proprietary technology, and the specific details of its autoscaling algorithm are not fully disclosed. However, their docs provide some insights into how the scaling mechanism works, together with this recent paper.

For scaling up, Aurora v2 seems to be highly responsive, with the system scaling up “instantly” when it detects that the current capacity is insufficient to handle the load. This responsiveness is achieved through continuous monitoring (every 1 second) of key resources such as CPU, memory, and network usage, which collectively determine when scaling is necessary.

However, when it comes to scaling down, the algorithm is more cautious. When the workload decreases, the system does not immediately reduce capacity to the minimum. Instead, it scales down in stages. This is done to avoid the premature eviction of cached pages, which could reduce the buffer pool size and impact performance. This conservative approach to scaling down helps maintain a stable buffer pool, ensuring that data remains readily available without the need for repeated retrievals, which could introduce latency.

Metering and cost

At the end of the month, Aurora Serverless v2 charges are based on the ACU-hours consumed. For Aurora I/O-Optimized instances (popular since their launch, since they don't add extra I/O charges) the price is $0.16 per ACU-hour in us-east-1. As a reminder, currently 1 ACU ≈ 0.25 vCPU, 2 GB memory.

Since Aurora Serverless v2 cannot scale to zero, the database incurs costs even during periods of inactivity, contributing to the total bill. For example, let’s break down the cost calculation with an example of a hypothetical instance: 

  • Instance reaches 20 ACU during peak usage for 100 hours. (20 ACU = 5 vCPU, 40 GB memory)
  • During idle time (620 hours) it goes to minimum capacity (0.5 ACU) 
  • Cost for an I/O optimized instance: $0.16 per ACU-hour

Costs calculation:

  • Cost during peak usage: [(20 ACUs) × (100 hours) × ($0.16 /ACU-hour)] = $320
  • Cost during idle time: [(0.5 ACUs) × (620 hours) × ($0.16 /ACU-hour)] = $49.60

Total monthly cost: $369.60 (this is compute only—not including storage, egress, etc.)
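To make the arithmetic above easy to reproduce, here is a minimal sketch of the ACU-hour billing model, using only the prices and capacities from the example (illustrative only, not AWS's billing logic):

```python
# Sketch of the Aurora Serverless v2 compute bill from the example above.
# Prices are from the worked example (us-east-1, I/O-Optimized).

ACU_HOUR_PRICE = 0.16  # $ per ACU-hour

def aurora_compute_cost(phases):
    """Sum ACU-hours across (capacity, hours) phases, times the unit price."""
    return sum(acus * hours for acus, hours in phases) * ACU_HOUR_PRICE

phases = [
    (20, 100),   # peak: 20 ACUs for 100 hours
    (0.5, 620),  # idle: minimum capacity, since Aurora can't scale to zero
]
print(f"${aurora_compute_cost(phases):.2f}")  # → $369.60
```

Note that the idle phase alone contributes $49.60: with no scale-to-zero, a completely inactive database still accrues ACU-hours.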

How autoscaling works in Neon 

Neon is open source, licensed under Apache 2.0. You can check out the code for our autoscaling engine here.

Similarly to Aurora Serverless, Neon autoscales capacity dynamically in response to load—but the architecture of both platforms has some important differences.

In Neon, compute capacity is measured in compute units (CUs); 1 CU equals 1 vCPU, 4 GiB memory. Notice that 1 CU ≠ 1 ACU: if we use vCPU as the reference, 1 CU = 4 ACU; if we consider the workload memory-bound, a fairer equivalence is 1 CU = 2 ACU.

Scaling range 

In self-serve pricing plans, Neon instances can currently autoscale from zero to 10 CUs, with more capacity supported via custom plans. Neon instances scale all the way down to zero when inactive, restarting with a cold start time of < 500 ms. 

Scaling increments 

From 0, Neon scales up and down in 0.25 CU increments. 

Disruptiveness 

Neon’s autoscaling process is nondisruptive: ongoing connections and database operations continue smoothly during capacity adjustments. Your database remains responsive and stable, even as resource allocation changes.

Scaling frequency and responsiveness

Neon’s autoscaling algorithm continuously monitors key metrics—CPU load, memory usage, and the estimated working set size—to adjust compute resources as needed. Neon’s approach dynamically adjusts resources based on real-time usage patterns, rather than relying on pre-configured thresholds, ensuring that compute size matches whichever metric demands the most resources.

The Neon algorithm is highly responsive for both scaling up and down. It checks the CPU load average every 5 seconds, adjusting resources quickly if the load is too high. Similarly, it monitors memory usage and dynamically estimates the working set size, ensuring frequently accessed data remains in memory to maximize performance.
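As a rough illustration of how a "whichever metric demands the most resources" policy works, here is a toy scaling function. The metric-to-CU heuristics are invented for this sketch; this is not Neon's actual algorithm, just a picture of the idea:

```python
# Toy metric-driven scaling decision (NOT Neon's real algorithm):
# size compute to the hungriest metric, stepping in 0.25 CU increments.

import math

CU_STEP = 0.25
MIN_CU, MAX_CU = 0.25, 10.0  # self-serve autoscaling range

def desired_cu(cpu_load_avg, working_set_gib):
    cu_for_cpu = cpu_load_avg            # assume ~1 CU per unit of load average
    cu_for_mem = working_set_gib / 4.0   # 1 CU = 4 GiB memory
    wanted = max(cu_for_cpu, cu_for_mem) # hungriest metric wins
    stepped = math.ceil(wanted / CU_STEP) * CU_STEP
    return min(max(stepped, MIN_CU), MAX_CU)

# A memory-bound workload: modest CPU load, 10 GiB working set.
print(desired_cu(cpu_load_avg=1.3, working_set_gib=10))  # → 2.5
```

Sizing to the estimated working set (rather than raw memory usage) is what keeps frequently accessed pages cached when the workload is read-heavy.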

Metering and cost

At the end of the month, Neon measures compute consumption in CU-hours (also called compute hours): compute units (CU) multiplied by the number of hours the instance has been running, a similar concept to Aurora's ACU-hours.

Unlike Aurora, Neon's pricing is subscription-based. You can choose between different pricing plans (including a Free plan). Every plan includes a certain number of CU-hours within the monthly fee; once the included CU-hours are used up, you're billed $0.16 per additional CU-hour. As a reminder, 1 CU in Neon = 1 vCPU, 4 GB memory, while 1 ACU = 0.25 vCPU, 2 GB memory, so per vCPU Neon's compute is cheaper than Aurora's.

This billing via pricing plans, combined with the comparatively lower cost per compute unit and the capacity to scale to zero, makes Neon much more affordable to run than Aurora Serverless v2. Going back to the example of the previous section: 

  • Neon instance consumes 10 CUs during peak usage for 100 hours*
  • During idle time (620 hours) it goes down to zero
  • Pricing plan: Scale ($69/month)

Calculation:

  • Compute hours consumed: [(10 CUs) x (100 hours) = 1000 compute hours]
  • Compute hours included in Scale plan: 750
  • Cost of additional compute hours: [250 x $0.16] = $40/month
  • Total monthly cost: $109/month, including 50GB storage (vs $369.60 in Aurora Serverless for compute only)

* Using the 1 CU = 2 ACU equivalence, the most conservative in terms of costs for Neon. If the workload were CPU-bound, Neon’s costs would be even lower.
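The plan-plus-overage arithmetic above can be sketched the same way (the plan fee and included CU-hours are taken from the example; actual plan details may change over time):

```python
# Sketch of Neon's subscription-plus-overage compute bill, using the
# Scale plan numbers from the example above.

PLAN_FEE = 69.0          # Scale plan, $/month
INCLUDED_CU_HOURS = 750
OVERAGE_PRICE = 0.16     # $ per additional CU-hour

def neon_compute_cost(cu_hours):
    extra = max(0, cu_hours - INCLUDED_CU_HOURS)
    return PLAN_FEE + extra * OVERAGE_PRICE

# Peak: 10 CUs for 100 hours; idle time costs nothing, since compute
# scales to zero when the database is inactive.
print(f"${neon_compute_cost(10 * 100):.2f}")  # → $109.00
```

The idle term simply disappears from the formula, which is where most of the gap with the Aurora example ($369.60 vs. $109) comes from.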

Comparison

We’re biased, but our honest takeaways for both autoscaling designs: 

  • Aurora Serverless v2 wins on the larger scaling range, offering a max capacity of 32 vCPUs / 256 GB memory. We'll be offering larger instance sizes in the Neon plans soon, but for now, if you need larger instances, reach out to us.
  • Neon wins on transparency, on its ability to scale down to zero, and on its more responsive scaling down. This, together with the overall lower price, makes Neon more affordable and efficient to run.
|  | Aurora Serverless v2 | Neon |
| --- | --- | --- |
| Compute unit definition | 1 ACU ≈ 0.25 vCPU, 2 GB memory | 1 CU = 1 vCPU, 4 GB memory |
| Scaling range | 0.5 to 128 ACUs; cannot scale to zero | 0 to 10 CUs (larger instances via custom plans); can scale down to zero |
| Scaling increment | 0.5 ACU | 0.25 CU |
| Disruptiveness | Nondisruptive scaling during operations | Nondisruptive scaling during operations |
| Scaling frequency | Responsive, but scaling down is cautious and gradual | Quick adjustments for both scaling up and down |
| Billing model | Pay-per-use based on ACU-hours consumed; no Free plan | Subscription-based with overage charges for extra compute hours; Free plan available |
| Unit cost | $0.16 per ACU-hour (1 ACU ≈ 0.25 vCPU, 2 GB memory) | $0.16 per CU-hour (1 CU = 1 vCPU, 4 GB memory) |
| Cost-efficiency | Higher costs due to higher compute unit price, no scale-to-zero, and conservative scaling down | Highly cost-efficient due to scale-to-zero, a responsive algorithm, and a lower compute unit price |

Start using Neon for free

A big advantage of Neon vs Aurora Serverless is its free tier. Create a Neon account today and start using Neon’s autoscaling yourself, together with all the Neon features, for free. No credit card required to sign up.