Setting Up PostgreSQL Replication Between Google Cloud SQL and an On-Premise Server for Business Continuity

Ensuring business continuity is crucial to minimizing downtime in today’s data-driven world. One effective approach is to set up database replication, which keeps your data synchronized between cloud and on-premise environments. This guide walks through the steps to configure replication between a PostgreSQL database hosted on Google Cloud SQL and an on-premise server, using Docker to deploy PostgreSQL. However, you are free to choose any other method to deploy PostgreSQL on your on-premise server if Docker isn’t your preferred option.

Steps to Set Up the Replication

Here’s a detailed guide on setting up PostgreSQL replication between a Cloud SQL instance in Google Cloud and an on-premise server.

1. Enable Replication in Cloud SQL

First, enable logical replication on your Cloud SQL instance to stream changes (WAL logs) to another server:

In the Google Cloud Console, go to your Cloud SQL instance. Under Flags, configure the following parameters:

cloudsql.logical_decoding = on
max_replication_slots = 10 (adjust based on how many replicas you need)
max_wal_senders = 10(controls how many WAL processes can send data)

Don’t forget to restart the Cloud SQL instance for these changes to take effect.

2. Grant Replication Permissions to the User

Ensure that the user intended for replication has the necessary privileges. You may run the following command to give the user the replication permission:

ALTER USER <your_user> WITH REPLICATION;

Don’t forget to replace <your_user> with the appropriate PostgreSQL user.

It’s important to use a dedicated user specifically for the replication setup, as this enhances security. Whenever possible, avoid using a generic user for this purpose.

3. Create a Publication in Cloud SQL

A publication defines which tables are replicated. For full database replication, create a publication for all tables:

CREATE PUBLICATION <your_publication_name> FOR ALL TABLES;

If you want to create a publication for specific tables only:

CREATE PUBLICATION <your_publication_name> FOR TABLE <schema_name>.<table_name>, <schema_name>.<table_name_2>;

Lastly, if you opt to create a publication for specific schemas only:

CREATE PUBLICATION <your_publication_name> FOR TABLES IN SCHEMA <schema_name_1>, <schema_name_2>;

Don’t forget to replace <your_publication_name> with your chosen name for the publication.

4. Backup the Cloud SQL Database Without Data

Next, create a schema-only backup of your Cloud SQL database, excluding the data. This schema will be used to initialize the on-premise server.

5. Setting Up PostgreSQL on the On-Premise Server

Now, set up PostgreSQL on your on-premise server. If you opt to use Docker, the following command will spin up a PostgreSQL instance:

docker run --name postgres_on_premise -e POSTGRES_USER=<your_user> -e POSTGRES_PASSWORD=<your_password> -d postgres:<version>

You can replace Docker with your own method of deploying PostgreSQL. The replication process should work the same way regardless of how PostgreSQL is deployed on the on-premise server.

6. Restore the Backup on the On-Premise Server

Once the PostgreSQL instance is running, you can setup a connection to the database using a database tool of your choice, but for this guide I recommend using DBeaver. Once connected to the database you may now restore the backup created from the previous step.

7. Whitelist the On-Premise Server’s IP Address on Cloud SQL

Ensure the on-premise server can access Cloud SQL by whitelisting its IP address in the Connections tab of your Cloud SQL instance.

8. Create a Subscription on the On-Premise Server

Create a subscription on your on-premise PostgreSQL instance to receive data from Cloud SQL. Run the following command:

CREATE SUBSCRIPTION <your_subscription_name>
CONNECTION 'host=<cloud_sql_ip> port=5432 dbname=<your_database> user=<your_user> password=<your_password>'
PUBLICATION <your_publication_name>;

Don’t forget to replace <your_subscription_name>, <cloud_sql_ip>, <your_database>, <your_user>, <your_password>, and <your_publication_name> with the relevant values.

9. Enable Synchronous Commit for Data Durability

To ensure data durability and consistency, especially in business continuity scenarios, enable synchronous commit:

ALTER SUBSCRIPTION <your_subscription_name> SET (synchronous_commit = on);

This setting ensures that data changes are acknowledged only after they have been fully replicated to the subscriber. It helps prevent data loss in case of failure by ensuring that both the primary and replica are synchronized before committing transactions.

10. Verify the Publication on Cloud SQL

Run the following queries to check that the publication is set up correctly and that replication slots are available:

SELECT * FROM pg_publication;
SELECT * FROM pg_replication_slots;
SELECT * FROM pg_publication_tables;

These queries should return details about the publication and replication status.

11. Verify the Subscription on the On-Premise Server

On the on-premise PostgreSQL instance, verify that the subscription is active and receiving data:

SELECT * FROM pg_subscription;
SELECT * FROM pg_stat_subscription;

12. Monitor Replication Lag on the On-Premise Server

Check for replication lag to ensure data is being replicated without delay:

SELECT pg_wal_lsn_diff(received_lsn, latest_end_lsn) AS replication_lag
FROM pg_stat_subscription
WHERE subname = '<your_subscription_name>';

You can also use the following query to get more details:

SELECT 
    sub.subname AS subscription_name,
    pg_current_wal_lsn() AS current_lsn,
    pg_last_wal_receive_lsn() AS last_received_lsn,
    pg_last_wal_replay_lsn() AS last_replayed_lsn,
    pg_wal_lsn_diff(pg_current_wal_lsn(), pg_last_wal_receive_lsn()) AS replication_lag
FROM 
    pg_subscription sub
WHERE sub.subname = '<your_subscription_name>';

Best Practices and Considerations

1. Regular Monitoring:
Keep a close watch on replication slots, WAL senders, and network connections. This will help you identify issues before they affect replication performance.

2. Synchronous vs. Asynchronous Replication:
If data consistency is your top priority, enable synchronous replication, which ensures transactions are acknowledged only after replication. For faster performance, you may opt for asynchronous replication, though it comes with the risk of losing some data if the primary instance fails.

3. Backup Strategy:
Even though replication provides real-time synchronization, it is not a substitute for a proper backup plan. Always maintain regular backups of your database for added security.

4. Network Stability:
Ensure a reliable and secure connection between Cloud SQL and the on-premise server. Any network instability could lead to delays or replication failures.

5. Security Considerations:
While this article touches on basic security measures, it doesn’t cover all aspects of securing a replication setup. It’s essential to follow your organization’s security standards and guidelines, especially regarding network security, data encryption, and authentication. Every environment has unique security requirements, so it’s crucial to collaborate with your IT and security teams to ensure all bases are covered.

Common Pitfalls to Avoid

Forgetting to Whitelist IPs: Make sure the on-premise server’s IP is allowed to access Cloud SQL. Not doing so will block replication.
Incorrect User Privileges: Ensure the PostgreSQL user has the appropriate replication privileges, or the replication will not work.
Replication Slot Limits: Be mindful of how many replication slots are in use. Over-saturating the slots can prevent new replicas from being created.

Conclusion

PostgreSQL replication between Google Cloud SQL and an on-premise server can play a vital role in business continuity planning. This guide outlines the steps needed to set up this replication using Docker as an option for deploying PostgreSQL on the on-premise server. However, you are not limited to using Docker; you can deploy PostgreSQL using any other method you prefer. Keep in mind the best practices, and monitor the replication process regularly to maintain optimal performance.

Additionally, I encourage you to share your unique approach and experience in setting up replication or handling similar tasks. By sharing different methods and lessons learned, we can foster continuous learning and improvement within the community. Every challenge or success provides valuable insights for others, and contributing to this shared knowledge helps us grow together.

mvryo

Search This Blog

Setting Up PostgreSQL Replication Between Google Cloud SQL and an On-Premise Server for Business Continuity

Steps to Set Up the Replication

1. Enable Replication in Cloud SQL

2. Grant Replication Permissions to the User

3. Create a Publication in Cloud SQL

4. Backup the Cloud SQL Database Without Data

5. Setting Up PostgreSQL on the On-Premise Server

6. Restore the Backup on the On-Premise Server

7. Whitelist the On-Premise Server’s IP Address on Cloud SQL

8. Create a Subscription on the On-Premise Server

9. Enable Synchronous Commit for Data Durability

10. Verify the Publication on Cloud SQL

11. Verify the Subscription on the On-Premise Server

12. Monitor Replication Lag on the On-Premise Server

Best Practices and Considerations

Common Pitfalls to Avoid

Conclusion

Labels

Comments

Post a Comment

Popular posts from this blog

Understanding Number Systems: Decimal, Binary, and Hexadecimal

How to Monetize Your API as an Individual Developer While Hosting on Your Own Server?

The Weight of Responsibility: A Developer’s Journey to Balance Passion and Reality

The Hidden Costs of Overdesign and Bad Practices in API Systems

Selenium for Beginners: What, Where, When, and Why to Use It in Automated Testing