Migrate from Firebase Firestore to Neon Postgres
Learn how to migrate your data from Firebase Firestore to Neon Postgres using a custom Python script
This guide describes how to migrate data from Firebase Firestore to Neon Postgres.
We'll use a custom Python script to export data from Firestore to local files, and then import that data into Neon Postgres. This approach lets us handle Firestore's document-based structure and convert it into a table-based format suitable for Postgres.
Prerequisites
- A Firebase project containing the Firestore data you want to migrate.
- A Neon project to move the data to. For detailed information on creating a Neon project, see Create a project.
- Python 3.10 or later installed on your local machine. Additionally, add the following packages to your Python virtual environment: firebase_admin, which is Google's Python SDK for Firebase, and psycopg, which is used to connect to your Neon Postgres database. You can install them using pip:
  pip install firebase-admin "psycopg[binary,pool]"
Retrieve Firebase credentials
This section describes how to fetch the credentials to connect to your Firebase Firestore database.
- Log in to your Firebase Console and navigate to your project.
- Go to Project settings (the gear icon next to "Project Overview" in the left sidebar).
- Under the Service Accounts tab, click Generate new private key. This will download a JSON file containing your credentials.
- Save this JSON file securely on your local machine. We'll use it in our Python script.
For more information, please consult the Firebase documentation.
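Before running the export, it can help to confirm that the downloaded file really is a service-account key. The following is a minimal sketch, not part of the migration scripts: `check_service_account_file` is a hypothetical helper name, and the set of required keys is an assumption based on the fields Google service-account key files normally contain.

```python
import json

# Fields normally present in a Google service-account key file (assumed subset).
REQUIRED_KEYS = {"type", "project_id", "private_key", "client_email"}


def check_service_account_file(path):
    """Return the project ID if the file looks like a service-account key."""
    with open(path, "r", encoding="utf-8") as f:
        info = json.load(f)

    # Report every missing field at once rather than failing on the first one.
    missing = REQUIRED_KEYS - info.keys()
    if missing:
        raise ValueError(f"Credentials file is missing keys: {sorted(missing)}")

    if info["type"] != "service_account":
        raise ValueError(f"Unexpected credential type: {info['type']!r}")

    return info["project_id"]
```

Calling `check_service_account_file("path/to/your/firebase-credentials.json")` returns the project ID, which you can compare with the project shown in the Firebase Console.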
Export data from Firestore
In this step, we will use a Python script to export data from Firestore. This script will:
- Connect to Firestore
- Retrieve all collections and documents
- Save the Firestore documents to a format suitable for ingesting into Postgres later
Here's the Python script:
import argparse
import json
import os
from collections import defaultdict

import firebase_admin
from firebase_admin import credentials, firestore


def download_from_firebase(db, output_dir):
    # Create output directory if it doesn't exist
    os.makedirs(output_dir, exist_ok=True)

    # Initialize a defaultdict to store documents for each collection
    output: dict[str, list[dict]] = defaultdict(list)

    def _download_collection(collection_ref):
        print(f"Downloading from collection: {collection_ref.id}")

        # Determine the parent path for the current collection
        # (top-level collections have no parent document)
        if collection_ref.parent:
            parent_path = collection_ref.parent.path
        else:
            parent_path = None

        # Iterate through all documents in the collection
        for doc in collection_ref.get():
            # Add document data to the output dictionary
            output[collection_ref.id].append(
                {
                    "id": doc.reference.path,
                    "parent_id": parent_path,
                    "data": doc.to_dict(),
                }
            )

            # Recursively handle subcollections
            for subcoll in doc.reference.collections():
                _download_collection(subcoll)

    # Start the download process with top-level collections
    for collection in db.collections():
        _download_collection(collection)

    # Save all (sub)collections to corresponding files, one JSON object per line
    for collection_id, docs in output.items():
        with open(os.path.join(output_dir, f"{collection_id}.json"), "w") as f:
            for doc in docs:
                # default=str falls back to str() for values that json cannot
                # encode natively, such as Firestore timestamps
                f.write(json.dumps(doc, default=str) + "\n")


def main():
    parser = argparse.ArgumentParser(
        description="Download data from Firebase Firestore"
    )
    parser.add_argument(
        "--credentials", required=True, help="Path to Firebase credentials JSON file"
    )
    parser.add_argument(
        "--output",
        default="firestore_data",
        help="Output directory for downloaded data",
    )
    args = parser.parse_args()

    # Initialize Firebase app
    cred = credentials.Certificate(args.credentials)
    firebase_admin.initialize_app(cred)
    db = firestore.client()

    # Download data from Firebase
    download_from_firebase(db, args.output)
    print(f"Firestore data downloaded to {args.output}")


if __name__ == "__main__":
    main()
Save this script as firebase-download.py. To run the script, you need to provide the path to your Firebase credentials JSON file and the output directory for the downloaded data. Run the following command in your terminal:
python firebase-download.py --credentials path/to/your/firebase-credentials.json --output firestore_data
For each unique collection id, this script creates a line-delimited JSON file, and all documents in that collection (spanning different top-level documents) are saved to it. For example, if you have a collection with the following structure:
/users
  /user1
    /orders
      /order1
      /order2
        /items
          /item1
          /item2
  /user2
    /orders
      /order3
The script will create the following files:
- users.json: Contains all user documents, i.e., user1, user2.
- orders.json: Contains all order documents across all users: order1, order2, order3.
- items.json: Contains all item documents across all orders: item1, item2.
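The mapping from a document to its output file follows from the document path: Firestore document paths alternate collection and document segments, so the collection ID is the second-to-last segment and everything before it is the parent document path. The sketch below illustrates that rule in isolation; `split_document_path` is a hypothetical helper and is not used by the export script, which reads these values from the SDK's reference objects instead.

```python
def split_document_path(doc_path):
    """Split a Firestore document path into (collection_id, parent_id)."""
    segments = doc_path.split("/")

    # Document paths alternate collection/document, so the segment count is even.
    if len(segments) < 2 or len(segments) % 2 != 0:
        raise ValueError(f"Not a document path: {doc_path!r}")

    collection_id = segments[-2]
    # Top-level documents have no parent document, so parent_id is None.
    parent_id = "/".join(segments[:-2]) or None
    return collection_id, parent_id


print(split_document_path("users/user1/orders/order1"))  # ('orders', 'users/user1')
print(split_document_path("users/user1"))  # ('users', None)
```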
Each file contains one JSON object per document, written on a single line. To illustrate, order1 gets saved to orders.json in the following format (pretty-printed here for readability):
{
  "id": "users/user1/orders/order1",
  "parent_id": "users/user1",
  "data": {
    "order_date": "2023-06-15",
    "total_amount": 99.99
  }
}
This structure allows for easy reconstruction of the hierarchical relationships between users, orders, and items, while also providing a flat file structure that's easy to process and import into other systems.
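As an illustration of how the hierarchy can be rebuilt from a flat file, here is a minimal sketch that groups exported documents by their `parent_id`. The function name `group_by_parent` and the sample lines are assumptions made for this example; only the `id`, `parent_id`, and `data` fields come from the export format described above.

```python
import json
from collections import defaultdict


def group_by_parent(lines):
    """Map each parent_id to the list of child document IDs found in `lines`."""
    children = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # Skip blank lines
        doc = json.loads(line)
        children[doc["parent_id"]].append(doc["id"])
    return dict(children)


# Hypothetical contents of orders.json, one JSON object per line.
sample_lines = [
    '{"id": "users/user1/orders/order1", "parent_id": "users/user1", "data": {"total_amount": 99.99}}',
    '{"id": "users/user1/orders/order2", "parent_id": "users/user1", "data": {}}',
    '{"id": "users/user2/orders/order3", "parent_id": "users/user2", "data": {}}',
]

orders_by_user = group_by_parent(sample_lines)
print(orders_by_user["users/user1"])
```

With a real export, you could pass an open file object instead of `sample_lines`, since iterating over a file yields its lines.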
Prepare your Neon destination database
This section describes how to prepare your destination Neon Postgres database to receive the imported data.
Create the Neon database
- In the Neon Console, go to your project dashboard.
- In the sidebar, click on Databases.
- Click the New Database button.
- Enter a name for your database and click Create.
For more information, see Create a database.
Retrieve Neon connection details
- In the Neon Console, go to your project dashboard.
- Find the Connection Details widget, and toggle to the correct Database option.
- Copy the connection string. It will look similar to this:
  postgresql://[user]:[password]@[neon_hostname]/[dbname]
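If you want to double-check which database a copied connection string points to before importing, the sketch below splits it into its components using only the Python standard library. `describe_connection_string` is a hypothetical helper, and the host and credentials in the usage line are placeholders, not real Neon values.

```python
from urllib.parse import unquote, urlsplit


def describe_connection_string(conn_string):
    """Return the non-secret parts of a postgresql:// connection string."""
    parts = urlsplit(conn_string)
    if parts.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"Unexpected scheme: {parts.scheme!r}")

    return {
        "user": unquote(parts.username) if parts.username else None,
        "host": parts.hostname,
        "dbname": parts.path.lstrip("/") or None,
        # Only report whether a password is present; never print it.
        "has_password": parts.password is not None,
    }


# Placeholder values for illustration only.
print(describe_connection_string("postgresql://alice:secret@db.example.invalid/neondb"))
```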
Import data into Neon
Next, we use another Python script to import the Firestore data we downloaded earlier into Neon.
import argparse
import json
import os

import psycopg
from psycopg.types.json import Jsonb

BATCH_SIZE = 20  # Number of rows sent to Postgres per executemany call


def upload_to_postgres(input_dir, conn_string):
    # Connect to the Postgres database
    conn = psycopg.connect(conn_string)
    try:
        # Iterate through all JSON files in the input directory
        for filename in os.listdir(input_dir):
            if not filename.endswith(".json"):
                continue

            # The table name comes from the file name (the collection ID),
            # so it must be a valid SQL identifier
            table_name = filename[:-5]  # Remove .json extension
            print("Writing to table:", table_name)

            # Create table for the collection if it doesn't exist
            create_table_query = f"""
                CREATE TABLE IF NOT EXISTS {table_name} (
                    id TEXT PRIMARY KEY,
                    parent_id TEXT,
                    data JSONB
                )
            """
            insert_query = f"""
                INSERT INTO {table_name} (id, parent_id, data)
                VALUES (%s, %s, %s)
                ON CONFLICT (id) DO UPDATE
                SET parent_id = EXCLUDED.parent_id, data = EXCLUDED.data
            """

            with conn.cursor() as cur:
                cur.execute(create_table_query)

                # Read and insert data from the JSON file in batches
                with open(os.path.join(input_dir, filename), "r") as f:
                    batch = []
                    for line in f:
                        doc = json.loads(line)
                        batch.append((doc["id"], doc["parent_id"], Jsonb(doc["data"])))
                        if len(batch) == BATCH_SIZE:
                            cur.executemany(insert_query, batch)
                            batch = []
                    # Insert the final, partially filled batch
                    if batch:
                        cur.executemany(insert_query, batch)

            # Commit changes for this file
            conn.commit()
    finally:
        # Close the connection even if an error occurred
        conn.close()


def main():
    parser = argparse.ArgumentParser(description="Upload data to Postgres")
    parser.add_argument(
        "--input",
        default="firestore_data",
        help="Input directory containing JSON files",
    )
    parser.add_argument("--postgres", required=True, help="Postgres connection string")
    args = parser.parse_args()

    # Upload data to Postgres
    upload_to_postgres(args.input, args.postgres)
    print(f"Data from {args.input} uploaded to Postgres")


if __name__ == "__main__":
    main()
Save this script as neon-import.py. To run the script, you need to provide the path to the input directory containing the JSON files and the Neon connection string. Run the following command in your terminal:
python neon-import.py --input firestore_data --postgres "<neon-connection-string>"
This script iterates over each JSON file in the input directory, creates a table in the Neon database for each collection, and inserts the data into the table. It also handles conflicts by updating the existing data with the new data.
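Batched inserts of this kind are easy to get subtly wrong at the end of the input, where fewer rows than a full batch may remain and still need to be written. The following is a minimal sketch of the batching pattern on its own, independent of Postgres; `in_batches` is a hypothetical helper name, not part of psycopg or of the import script.

```python
from itertools import islice


def in_batches(iterable, size):
    """Yield lists of up to `size` items, including the final partial batch."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return  # Input exhausted
        yield batch


# 45 rows with a batch size of 20 yields batches of 20, 20, and 5 rows.
print([len(batch) for batch in in_batches(range(45), 20)])  # [20, 20, 5]
```

Each yielded batch could be passed to a call such as `cur.executemany(insert_query, batch)`, so no rows are left behind when the row count is not a multiple of the batch size.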
Verify the migration
After running both the Firestore export and the Neon import scripts, you should verify that your data has been successfully migrated:
- Connect to your Neon database using the Neon SQL Editor or psql.
- List all tables in your database:
  \dt
- Run some sample queries to check that the data has been successfully imported. For example, with the users and orders collections from the example above, the following query fetches all orders made by two of the users:
  SELECT data FROM orders WHERE parent_id IN ( SELECT id FROM users LIMIT 2 );
Compare the results with those from your Firestore database to ensure data integrity. Note that using the parent_id field, we can navigate through the hierarchical structure of the original data.
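One simple integrity check is to compare the number of exported documents per collection with the row count of the matching table. The sketch below counts the non-empty lines in each exported file; `count_exported_documents` is a hypothetical helper, and it assumes the export directory layout described earlier (one line-delimited `.json` file per collection ID).

```python
import os


def count_exported_documents(input_dir):
    """Return {collection_id: number_of_documents} for each .json file."""
    counts = {}
    for filename in sorted(os.listdir(input_dir)):
        if not filename.endswith(".json"):
            continue
        path = os.path.join(input_dir, filename)
        with open(path, "r", encoding="utf-8") as f:
            # Each non-empty line holds exactly one exported document.
            counts[filename[:-5]] = sum(1 for line in f if line.strip())
    return counts
```

You can then run `SELECT count(*) FROM orders;` (and likewise for the other tables) in Neon and compare each result with the corresponding value from `count_exported_documents("firestore_data")`. If a table already held rows before the import, its count may be higher than the exported count.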
Other migration options
While this guide focuses on using a custom Python script, there are other migration options available:
- Firestore managed export/import
  If you have a large volume of data to migrate, you can use the Google Cloud Firestore managed export and import service. It allows you to export your Firestore data to a Google Cloud Storage bucket, from where you can download and ingest it into Neon.
- Open source utilities
  There are also a number of open source utilities available that can help export data from Firestore to local files. However, these utilities are not as robust as the managed export/import service. If your dataset is small, we recommend using the sample code provided above or adapting it to your specific needs.