
The anon extension


Protecting sensitive data in Postgres databases

The anon extension (PostgreSQL Anonymizer) provides data masking and anonymization capabilities for Postgres databases. It helps protect personally identifiable information (PII) and other sensitive data, which can facilitate compliance with regulations such as GDPR.

This guide introduces the anon extension and demonstrates how to implement masking rules for static data anonymization, which is currently the only masking method supported on Neon.


Enable the extension

Note: The anon extension is currently experimental and may change in future releases.

Enable the anon extension in your Neon database by following these steps:

  1. Connect to your Neon database using either the Neon SQL Editor or an SQL client like psql

  2. Enable experimental extensions:

    SET neon.allow_unstable_extensions='true';
  3. Install the extension:

    CREATE EXTENSION IF NOT EXISTS anon;
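
To confirm that the installation succeeded, one option is to query the standard pg_extension catalog. This is a minimal sketch that relies only on core Postgres, and the version it reports depends on your Neon project. Because a plain SET applies to the current session only, run the SET and CREATE EXTENSION statements in the same session.

```sql
-- Check that the anon extension is installed and which version is active
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'anon';
```

An empty result means the extension has not been created in the current database.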

Masking rules

Masking rules define which data to mask and how to mask it using SQL syntax. These rules are applied using SECURITY LABEL SQL commands and stored within the database schema to implement the privacy by design principle.
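
The following sketch shows the general shape of a rule. It attaches a label to a column, lists the stored labels through the standard pg_seclabels view, and then removes the rule again. The customers table and its last_name column are hypothetical names used only for illustration, and the extension is assumed to be enabled already.

```sql
-- Hypothetical table, used only for illustration
CREATE TABLE IF NOT EXISTS customers (
    id SERIAL PRIMARY KEY,
    last_name TEXT
);

-- Attach a masking rule to the column; the rule is stored as a security label
SECURITY LABEL FOR anon ON COLUMN customers.last_name
IS 'MASKED WITH VALUE ''CONFIDENTIAL''';

-- List the rules currently stored for the anon label provider
SELECT objtype, objname, label
FROM pg_seclabels
WHERE provider = 'anon';

-- Remove the rule by setting the label to NULL
SECURITY LABEL FOR anon ON COLUMN customers.last_name IS NULL;
```

Defining a rule does not change any data by itself. The rule only takes effect when masking is applied.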

Masking functions

The anon extension provides built-in functions for different anonymization requirements, including but not limited to:

Function Type | Description | Example
--- | --- | ---
Faking | Generate realistic data | anon.fake_first_name() and anon.lorem_ipsum()
Pseudonymization | Create consistent and reversible fake data | anon.pseudo_email(seed)
Randomization | Generate random values | anon.random_int_between(10, 100) and anon.random_in_enum(enum_column)
Partial scrambling | Hide portions of strings | anon.partial(ip_address, 8, 'XXX.XXX', 0) would change 192.168.1.100 to 192.168.XXX.XXX
Nullification | Replace with static values or NULL | MASKED WITH VALUE 'CONFIDENTIAL'
Noise addition | Alter numerical values while maintaining distribution | anon.noise(salary, 0.1) adds +/- 10% noise to the salary column
Generalization | Replace specific values with broader categories | anon.generalize_int4range(age, 10) would change 54 to [50,60)
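
Before writing a rule, it can help to call a masking function directly on a literal value and inspect what it returns. The sketch below assumes the extension is enabled and uses only functions and inputs named in the table above. The results noted in the comments are the ones the table states, not separately measured output.

```sql
-- Partial scrambling: keep the first 8 characters, pad with 'XXX.XXX', keep no suffix
-- The table above gives the result as 192.168.XXX.XXX
SELECT anon.partial('192.168.1.100', 8, 'XXX.XXX', 0);

-- Randomization: an integer between the two bounds, different on each call
SELECT anon.random_int_between(10, 100);

-- Generalization: the table above gives the result for 54 as [50,60)
SELECT anon.generalize_int4range(54, 10);
```

In a direct call, string arguments take ordinary single quotes. Inside a SECURITY LABEL string, the same quotes must be doubled ('' instead of ') because the whole rule is itself a quoted string.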

Static masking

Static masking permanently modifies the original data in your tables. This approach is useful for creating anonymized copies of data when:

  • Migrating production data to development branches
  • Creating sanitized datasets for testing
  • Archiving data with sensitive information removed
  • Distributing data to third parties

Implementation example

  1. Create a sample table

    Create a sample users table with sensitive information:

    CREATE TABLE users (
        id SERIAL PRIMARY KEY,
        username VARCHAR(50) UNIQUE,
        email VARCHAR(255),
        phone_number VARCHAR(20),
        city VARCHAR(100)
    );
    
    INSERT INTO users (username, email, phone_number, city) VALUES
        ('john_doe', 'john.doe@example.com', '555-123-4567', 'New York'),
        ('jane_smith', 'jane.smith@example.com', '555-987-6543', 'Los Angeles'),
        ('peter_jones', 'peter.jones@example.com', '555-555-1111', 'Chicago');
  2. Define masking rules

    Apply masking rules to specific columns using SECURITY LABEL:

    -- Mask email addresses with realistic-looking but fake emails
    SECURITY LABEL FOR anon ON COLUMN users.email
    IS 'MASKED WITH FUNCTION anon.dummy_safe_email()';
    
    -- Partially mask phone numbers, preserving the first digit and last two digits
    SECURITY LABEL FOR anon ON COLUMN users.phone_number
    IS 'MASKED WITH FUNCTION anon.partial(phone_number, 1, ''XXX-XXX-'', 2)';
  3. Initialize and apply masking

    The anon.init() function initializes the anon extension by loading the default fake data sets and setting up the masking environment. This step is required and must be run before any masking functions are applied.

    SELECT anon.init();

    Then apply the masking rules to permanently transform the data:

    Warning: Static masking irreversibly modifies your data. The original values cannot be recovered after anonymization.

    SELECT anon.anonymize_table('users');
  4. Verify results

    After applying the masking, your data will be anonymized according to the defined rules:

    SELECT * FROM users;
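    id | username    | email                     | phone_number | city
    ---+-------------+---------------------------+--------------+------------
    1  | john_doe    | mcknightjulie@example.org | 5XXX-XXX-67  | New York
    2  | jane_smith  | davidhanson@example.org   | 5XXX-XXX-43  | Los Angeles
    3  | peter_jones | michael33@example.org     | 5XXX-XXX-11  | Chicago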

    Note how:

    • Email addresses were replaced with fictional but valid-looking addresses
    • Phone numbers only show the first digit and last two digits
    • The username and city columns remain unchanged as no masking rules were defined for them
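
The walkthrough leaves the city column unmasked. As a hedged extension of the example, the sketch below adds a nullification-style rule for that column and applies it to that column alone. It assumes that anon.anonymize_column, which the upstream PostgreSQL Anonymizer project documents for static masking, is available in Neon's build of the extension. Like anon.anonymize_table, it overwrites the stored values permanently.

```sql
-- Add a rule for a column that the walkthrough left untouched
SECURITY LABEL FOR anon ON COLUMN users.city
IS 'MASKED WITH VALUE ''CONFIDENTIAL''';

-- Apply the masking rule for this one column only (irreversible)
-- Assumption: anon.anonymize_column is available in Neon's build
SELECT anon.anonymize_column('users', 'city');

-- Inspect the affected column
SELECT id, username, city FROM users;
```

Masking one column at a time keeps the change small when rules are added incrementally, at the cost of one call per column.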

Limitations

  • Neon currently only supports static masking with the anon extension
  • Additional pg_catalog functions cannot be declared as TRUSTED in Neon's implementation

Conclusion

The anon extension provides a toolkit for protecting sensitive data in Postgres databases. By defining appropriate masking rules, you can create anonymized datasets that maintain usability while protecting individual privacy.
