Replicate data with Airbyte

Learn how to replicate data from Neon with Airbyte

Neon's logical replication feature allows you to replicate data from your Neon Postgres database to external destinations.

Airbyte is an open-source data integration platform that moves data from a source to a destination system. Airbyte offers a large library of connectors for various data sources and destinations.

In this guide, you will learn how to define your Neon Postgres database as a data source in Airbyte so that you can stream data to one or more of Airbyte's supported destinations.

Prerequisites

Prepare your source Neon database

This section describes how to prepare your source Neon database (the publisher) for replicating data to Airbyte.

Enable logical replication in Neon

important

Enabling logical replication changes the Postgres wal_level configuration parameter from replica to logical for all databases in your Neon project. Once wal_level is set to logical, it cannot be reverted. Enabling logical replication also restarts all computes in your Neon project, which drops active connections, so clients must reconnect.

To enable logical replication in Neon:

  1. Select your project in the Neon Console.
  2. On the Neon Dashboard, select Settings.
  3. Select Logical Replication.
  4. Click Enable to enable logical replication.

You can verify that logical replication is enabled by running the following query from the Neon SQL Editor:

SHOW wal_level;
 wal_level
-----------
 logical

Create a Postgres role for replication

It's recommended that you create a dedicated Postgres role for replicating data. The role must have the REPLICATION privilege. The default Postgres role created with your Neon project and roles created using the Neon CLI, Console, or API are granted membership in the neon_superuser role, which has the required REPLICATION privilege.

The following CLI command creates a role. To view the CLI documentation for this command, see Neon CLI commands: roles.

neon roles create --name replication_user
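
To confirm that the role is usable for replication, you can inspect it from the Neon SQL Editor. The query below is a minimal sketch that assumes the role name replication_user from the command above. It reports whether the role carries the REPLICATION attribute itself and whether it is a member of neon_superuser:

```sql
-- Assumes the role name replication_user used in this guide.
SELECT r.rolname,
       r.rolreplication,                                   -- true if the role itself has REPLICATION
       pg_has_role(r.rolname, 'neon_superuser', 'member')  -- true if the role is a member of neon_superuser
           AS in_neon_superuser
FROM pg_roles AS r
WHERE r.rolname = 'replication_user';
```

If the query returns no rows, the role does not exist in the project you are connected to.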

Grant schema access to your Postgres role

If your replication role does not own the schemas and tables you are replicating from, make sure to grant access. For example, the following commands grant access to all tables in the public schema to Postgres role replication_user:

GRANT USAGE ON SCHEMA public TO replication_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO replication_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO replication_user;

Granting SELECT ON ALL TABLES IN SCHEMA instead of naming the specific tables avoids having to add privileges later if you add tables to your publication.
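
To check that the grants took effect, you can ask Postgres directly which privileges the role holds. This is a sketch that assumes the replication_user role and the public schema from the example above:

```sql
-- Schema-level access: should return true after GRANT USAGE.
SELECT has_schema_privilege('replication_user', 'public', 'USAGE') AS can_use_schema;

-- Table-level access: one row per table in the public schema.
SELECT tablename,
       has_table_privilege('replication_user',
                           format('%I.%I', schemaname, tablename),
                           'SELECT') AS can_select
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY tablename;
```

Note that ALTER DEFAULT PRIVILEGES, when run without a FOR ROLE clause, only covers tables created later by the role that ran the statement. Tables created by other roles may still show can_select as false and need an explicit grant.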

Create a replication slot

Airbyte requires a dedicated replication slot. Only one source should be configured to use this replication slot.

Airbyte uses the pgoutput plugin in Postgres for decoding WAL changes into a logical replication stream. To create a replication slot called airbyte_slot that uses the pgoutput plugin, run the following command on your database using your replication role:

SELECT pg_create_logical_replication_slot('airbyte_slot', 'pgoutput');

airbyte_slot is the name assigned to the replication slot. You will need to provide this name when you set up your Airbyte source.
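
You can confirm that the slot exists and uses the expected plugin by querying the pg_replication_slots view. The sketch below assumes the slot name airbyte_slot. The retained_wal column is an illustrative addition that estimates how much WAL the slot is currently holding back:

```sql
SELECT slot_name,
       plugin,      -- expected: pgoutput
       slot_type,   -- expected: logical
       database,
       active,      -- true while a consumer is connected to the slot
       pg_size_pretty(
           pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
       ) AS retained_wal
FROM pg_replication_slots
WHERE slot_name = 'airbyte_slot';
```

A slot that no consumer reads from keeps retaining WAL, so it is worth re-running this query if syncs are paused for a long time.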

Create a publication

Perform the following steps to publish the tables you want to replicate data from:

  1. Add the replication identity (the method of distinguishing between rows) for each table you want to replicate:

    ALTER TABLE <table_name> REPLICA IDENTITY DEFAULT;

    In rare cases, if your tables use data types that support TOAST or have very large field values, consider using REPLICA IDENTITY FULL instead:

    ALTER TABLE <table_name> REPLICA IDENTITY FULL;
  2. Create the Postgres publication. Include all tables you want to replicate as part of the publication:

    CREATE PUBLICATION airbyte_publication FOR TABLE <table_name, table_name, table_name>;

    Alternatively, you can create a publication for all tables:

    CREATE PUBLICATION airbyte_publication FOR ALL TABLES;

    The publication name is customizable. Refer to the Postgres docs if you need to add or remove tables from your publication.
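
As a sketch of what changing a publication later can look like, the statements below add one table and remove another, then list the tables the publication currently covers. The table names orders and customers_archive are hypothetical placeholders, and the publication name assumes the airbyte_publication example above:

```sql
-- orders and customers_archive are hypothetical table names.
ALTER PUBLICATION airbyte_publication ADD TABLE orders;
ALTER PUBLICATION airbyte_publication DROP TABLE customers_archive;

-- List the tables currently included in the publication.
SELECT schemaname, tablename
FROM pg_publication_tables
WHERE pubname = 'airbyte_publication'
ORDER BY schemaname, tablename;
```

ADD TABLE and DROP TABLE do not apply to a publication created with FOR ALL TABLES, since that publication already includes every table.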

note

The Airbyte UI currently allows selecting any table for Change Data Capture (CDC). If a table is selected that is not part of the publication, it will not be replicated even though it is selected. If a table is part of the publication but does not have a replication identity, the replication identity will be created automatically on the first run if the Postgres role you use with Airbyte has the necessary permissions.
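
Before selecting tables in Airbyte, it can help to check both conditions described in the note. The queries below are a sketch that assumes the public schema and the airbyte_publication name used in this guide. The first lists tables that are not in the publication, and the second shows the replica identity setting of each published table:

```sql
-- Tables in the public schema that are NOT part of the publication.
SELECT t.schemaname, t.tablename
FROM pg_tables AS t
LEFT JOIN pg_publication_tables AS pt
       ON pt.pubname = 'airbyte_publication'
      AND pt.schemaname = t.schemaname
      AND pt.tablename = t.tablename
WHERE t.schemaname = 'public'
  AND pt.tablename IS NULL
ORDER BY t.tablename;

-- Replica identity of each published table, and whether it has a primary key.
SELECT pt.schemaname,
       pt.tablename,
       CASE c.relreplident
           WHEN 'd' THEN 'default'
           WHEN 'f' THEN 'full'
           WHEN 'i' THEN 'index'
           WHEN 'n' THEN 'nothing'
       END AS replica_identity,
       EXISTS (SELECT 1
               FROM pg_index AS i
               WHERE i.indrelid = c.oid
                 AND i.indisprimary) AS has_primary_key
FROM pg_publication_tables AS pt
JOIN pg_namespace AS n ON n.nspname = pt.schemaname
JOIN pg_class AS c ON c.relname = pt.tablename
                  AND c.relnamespace = n.oid
WHERE pt.pubname = 'airbyte_publication'
ORDER BY pt.schemaname, pt.tablename;
```

With REPLICA IDENTITY DEFAULT, Postgres identifies rows by the primary key, so a published table that shows default with has_primary_key set to false is a candidate for closer review.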

Create a Postgres source in Airbyte

  1. From your Airbyte Cloud account, select Sources from the left navigation bar, search for Postgres, and then create a new Postgres source.

  2. Enter the connection details for your Neon database. You can get these details from your Neon connection string, which you'll find in the Connection Details widget on the Dashboard of your Neon project. For example, given a connection string like this:

    postgresql://alex:AbC123dEf@ep-cool-darkness-123456.us-east-2.aws.neon.tech/dbname?sslmode=require

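    Enter the details in the Airbyte Create a source dialog as shown below. If you created a dedicated replication role, use that role's name and password instead of the ones in the connection string. Your values will differ.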

    • Host: ep-cool-darkness-123456.us-east-2.aws.neon.tech
    • Port: 5432
    • Database Name: dbname
    • Username: replication_user
    • Password: AbC123dEf
  3. Under Optional fields, list the schemas you want to sync. Schema names are case-sensitive, and multiple schemas may be specified. By default, public is the only selected schema.

  4. Select an SSL mode. You will most frequently choose require or verify-ca. Both of these options always require encryption. The verify-ca mode requires a certificate. Refer to Connect securely for information about the location of certificate files you can use with Neon.

  5. Under Advanced:

    • Select Read Changes using Write-Ahead Log (CDC) from available replication methods.
    • In the Replication Slot field, enter the name of the replication slot you created previously: airbyte_slot.
    • In the Publication field, enter the name of the publication you created previously: airbyte_publication.

Allow inbound traffic

If you are on Airbyte Cloud, and you are using Neon's IP Allow feature to limit IP addresses that can connect to Neon, you will need to allow inbound traffic from Airbyte's IP addresses. You can find a list of IPs that need to be allowlisted in the Airbyte Security docs. For information about configuring allowed IPs in Neon, see Configure IP Allow.

Complete the source setup

To complete your source setup, click Set up source in the Airbyte UI. Airbyte will test the connection to your database. Once this succeeds, you've successfully configured an Airbyte Postgres source for your Neon database.

Configure a destination

To complete your data integration setup, you can now add one of Airbyte's supported destinations, such as Snowflake, BigQuery, or Kafka. After configuring a destination, you'll need to set up a connection between your Neon source database and your chosen destination. Refer to the Airbyte documentation for instructions.
