Build multi-tenant RAG with Neon's database-per-user model — no nosy neighbors, max isolation, minimal costs
Docs/Partner guide/Configure consumption limits

Configure consumption limits

Learn how to set consumption limits per project with the Neon API

When setting up your integration's billing solution with Neon, you may want to impose some hard limits on how much storage or compute resources a given project can consume. For example, you may want to cap how much usage your free plan users can consume versus pro or enterprise users. With the Neon API, you can use the quota key to set usage limits for a variety of consumption metrics. These limits act as thresholds after which all active computes for a project are suspended.

Metrics and quotas

By default, Neon tracks a variety of consumption metrics at the project level. If you want to set quotas (max limits) for these metrics, you need to explicitly configure them.

Available metrics

Here are the relevant metrics that you can track in order to understand your users' current consumption levels.

Project-level metrics

  • active_time_seconds
  • compute_time_seconds
  • written_data_bytes
  • data_transfer_bytes

These consumption metrics represent total cumulative usage across all branches and computes in a given project, accrued so far in a given monthly billing period. Metrics are refreshed on the first day of the following month, when the new billing period starts.

Branch-level metric

There is an additional value that you also might want to track: logical_size, which gives you the current size of a particular branch.

Neon updates all metrics every 15 minutes but it could take up to 1 hour before they are reportable.

To find the current usage level for any of these metrics, see querying metrics. You can read more about these metrics and how they impact billing here: Usage metrics

Corresponding quotas

You can set quotas for these consumption metrics per project using the quota settings object in the Create project or Update project API.

The quota object includes an array of parameters used to set threshold limits. Their names generally match their corresponding metric:

  • active_time_seconds — Sets the maximum amount of time your project's computes are allowed to be active during the current billing period. It excludes time when computes are in an Idle state due to auto-suspension.
  • compute_time_seconds — Sets the maximum amount of CPU seconds allowed in total across all of a project's computes. This includes any computes deleted during the current billing period. Note that the larger the compute size per endpoint, the faster the project consumes compute_time_seconds. For example, 1 second at .25 vCPU costs .25 compute seconds, while 1 second at 4 vCPU costs 4 compute seconds.
    vCPUsactive_time_secondscompute_time_seconds
    0.2510.25
    414
  • written_data_bytes — Sets the maximum amount of data in total, measured in bytes, that can be written across all of a project's branches for the month.
  • data_transfer_bytes — Sets the maximum amount of egress data, measured in bytes, that can be transferred out of Neon from across all of a project's branches using the proxy.

There is one additional quota parameter, logical_size_bytes, which applies to individual branches, not to the overall project. You can use logical_size_bytes to set the maximum size (measured in bytes) that any one individual branch is allowed to reach. Once this threshold is met, the compute for that particular branch (and only that particular branch) is suspended. Note that this limit is not refreshed once per month: it is a strict size limit that applies for the life of the branch.

Sample quotas

Let's say you want to set limits for an application with two tiers, Trial and Pro, you might set limits like the following:

Parameter (project)Trial (.25 vCPU)Pro (max 4 vCPU)
active_time_seconds633,600 (business month 22 days)2,592,000 (30 days)
compute_time_seconds158,400 (approx 44 hours)10,368,000 (4 times the active hours for 4 vCPUs)
written_data_bytes1,000,000,000 (approx. 1 GiB)50,000,000,000 (approx. 50 GiB)
data_transfer_bytes500,000,000 (approx. 500 MiB)10,000,000,000 (approx. 10 GiB)
Parameter (branch)TrialPro
logical_size_bytes100,000,000 (approx. 100 MiB)10,000,000,000 (approx. 10 GiB)

Guidelines

Generally, the most effective quotas for controlling spend per project are those controlling maximum compute (active_time_seconds and compute_time_seconds) and maximum written storage (written_data_bytes). In practice, it is possible that data_transfer_bytes could introduce unintended logical constraints against your usage. For example, let's say you want to run a cleanup operation to reduce your storage. If part of this cleanup operation involves moving data across the network (for instance, to create an offsite backup before deletion), the data_transfer_bytes limit could prevent you from completing the operation — an undesirable situation where two measures meant to control cost interfere with one another.

Neon default limits

In addition to the configurable limits that you can set, Neon also sets certain branch size limits by default. You might notice these limits in a Get Project response:

  • branch_logical_size_limit (MiB)
  • branch_logical_size_limit_bytes(Bytes)

These limits are not directly configurable. You can query the limits by running the Get project details or Get project list endpoints.

Suspending active computes

What happens when a quota is met?

When any configured metric reaches its quota limit, all active computes for that project are automatically suspended. It is important to understand, this suspension is persistent. It works differently than the inactivity-based autosuspend, where computes restart at the next interaction: this suspend will not restart at the next API call or incoming connection. If you don't take explicit action otherwise, the suspension remains in place until the end of the current billing period starts (consumption_period_end).

See Querying metrics and quotas to find the reset date, billing period, and other values related to a project's consumption.

note

Neon tracks these consumption metrics on a monthly cycle. If you want to track metrics on a different cycle, you need to take snapshots of your metrics at the desired interval and store the data externally. You can also use the Consumption API to collect metrics from across a range of billing periods.

Configuring quotas

You can set quotas using the Neon API either in a POST when you create a project or a PATCH to update an existing project:

Set quotas when you create the project

For performance reasons, you might want to configure these quotas at the same time that you create a new project for your user using the Create a project API, reducing the number of API calls you need to make.

Here is a sample POST in curl that creates a new project called UserNew and sets the active_time_seconds quota to a total allowed time of 10 hours (36,000 seconds) for the month, and a total allowed compute_time_seconds set to 2.5 hours (9,000 seconds) for the month. This 4:1 ratio between active and compute time is suitable for a fixed compute size of 0.25 vCPU.

curl --request POST \
     --url https://console.neon.tech/api/v2/projects \
     --header 'Accept: application/json' \
     --header "Authorization: Bearer $NEON_API_KEY" \
     --header 'Content-Type: application/json' \
     --data '
{
  "project": {
    "settings": {
      "quota": {
        "active_time_seconds": 36000,
        "compute_time_seconds": 9000
      }
    },
    "pg_version": 15,
    "name": "UserProject"
  }
}
' | jq

Update an existing project

If you need to change the quota limits for an existing project — for example, if a user switches their plan to a higher usage tier — you can reset those limits via PATCH request. See Update a project in the Neon API.

Here is a sample PATCH that updates both the active_time_seconds and compute_time_seconds quotas to 30 hours (108,000):

curl --request PATCH \
     --url https://console.neon.tech/api/v2/projects/[project_ID]\
     --header 'Accept: application/json' \
     --header "Authorization: Bearer $NEON_API_KEY" \
     --header 'Content-Type: application/json' \
     --data '
{
  "project": {
    "settings": {
      "quota": {
        "active_time_seconds": 108000,
        "compute_time_seconds": 108000
      }
    }
  }
}
' | jq

Querying metrics and quotas

You can use the Neon API to retrieve consumption metrics for your organization and projects through various endpoints:

EndpointDescriptionPlan AvailabilityDocs
Aggregated account metricsAggregates the metrics from all projects in an account into a single cumulative number for each metricScale and Business plan onlyGet account-level aggregated metrics
Granular metrics per projectProvides detailed metrics for each project in an account at a specified granularity level (e.g., hourly, daily, monthly)Scale and Business plan onlyGet granular project-level metrics for the account
Single project metricsRetrieves detailed metrics and quota information for a specific projectAll plansGet metrics for a single specified project

Resetting a project after suspend

Projects remain suspended until the next billing period. It is good practice to notify your users when they are close to reaching a limit; if the user is then suspended and loses access to their database, it will not be unexpected. If you have configured no further actions, the user will have to wait until the next billing period starts to resume usage.

Alternatively, you can actively reset a suspended compute by changing the impacted quota to 0: this effectively removes the limit entirely. You will need to reset this quota at some point if you want to maintain limits.

Using quotas to actively suspend a user

If you want to suspend a user for any reason — for example, suspicious activity or payment issues — you can use these quotas to actively suspend a given user. For example, setting active_time_limit to a very low threshold (e.g., 1) will force a suspension if the user has 1 second of active compute for that month. To remove this suspension, you can set the threshold temporarily to 0 (infinite) or some value larger than their currently consumed usage.

In addition to setting quota limits against the project as a whole, there are other sizing-related settings you might want to use to control the amount of resources any particular endpoint is able to consume:

  • autoscaling_limit_min_cu — Sets the minimium compute size for the endpoint. The default minimum is .25 vCPU but can be increased if your user's project could benefit from a larger compute start size.
  • autoscaling_limit_max_cu — Sets a hard limit on how much compute an endpoint can consume in response to increased demand. For more info on min and max cpu limits, see Autoscaling.
  • suspend_timeout_seconds — Sets how long an endpoint's allotted compute will remain active with no current demand. After the timeout period, the endpoint is suspended until demand picks up. For more info, see Autosuspend.

There are several ways you can set these endpoint settings using the Neon API: you can set project-level defaults that apply for any new computes created in the project, you can define the endpoint settings when creating a new branch, or you can adjust these settings when creating or updating an endpoint for an existing branch.

See these sample CURL requests for each method.

In this sample, we are setting defaults for all new endpoints created in the project as a whole. The minimum compute size is at 1 vCPU, the max size at 3 vCPU, and a 10 minute (600 seconds) inactivty period before the endpoint is suspended.

These default values are set in the default_endpoint_settings object.

curl --request POST \
     --url https://console.neon.tech/api/v2/projects \
     --header 'Accept: application/json' \
     --header "Authorization: Bearer $NEON_API_KEY" \
     --header 'Content-Type: application/json' \
     --data '
{
  "project": {
    "default_endpoint_settings": {
      "autoscaling_limit_min_cu": 1,
      "autoscaling_limit_max_cu": 3,
      "suspend_timeout_seconds": 600
    },
    "pg_version": 15
  }
}
' | jq

Last updated on

Was this page helpful?