Setting Up PostgreSQL Replication Between Google Cloud SQL and an On-Premise Server for Business Continuity
Ensuring business continuity is crucial to minimizing downtime in today’s data-driven world. One effective approach is to set up database replication, which keeps your data synchronized between cloud and on-premise environments. This guide walks through the steps to configure replication between a PostgreSQL database hosted on Google Cloud SQL and an on-premise server, using Docker to deploy PostgreSQL. However, you are free to choose any other method to deploy PostgreSQL on your on-premise server if Docker isn’t your preferred option.
Steps to Set Up the Replication
Here’s a detailed guide on setting up PostgreSQL replication between a Cloud SQL instance in Google Cloud and an on-premise server.
1. Enable Replication in Cloud SQL
First, enable logical replication on your Cloud SQL instance to stream changes (WAL logs) to another server:
In the Google Cloud Console, go to your Cloud SQL instance. Under Flags, configure the following parameters:
cloudsql.logical_decoding = on
max_replication_slots = 10
(adjust based on how many replicas you need)max_wal_senders = 10
(controls how many WAL processes can send data)
Don’t forget to restart the Cloud SQL instance for these changes to take effect.
2. Grant Replication Permissions to the User
Ensure that the user intended for replication has the necessary privileges. You may run the following command to give the user the replication permission:
ALTER USER <your_user> WITH REPLICATION;
Don’t forget to replace <your_user>
with the appropriate PostgreSQL user.
It’s important to use a dedicated user specifically for the replication setup, as this enhances security. Whenever possible, avoid using a generic user for this purpose.
3. Create a Publication in Cloud SQL
A publication defines which tables are replicated. For full database replication, create a publication for all tables:
CREATE PUBLICATION <your_publication_name> FOR ALL TABLES;
If you want to create a publication for specific tables only:
CREATE PUBLICATION <your_publication_name> FOR TABLE <schema_name>.<table_name>, <schema_name>.<table_name_2>;
Lastly, if you opt to create a publication for specific schemas only:
CREATE PUBLICATION <your_publication_name> FOR TABLES IN SCHEMA <schema_name_1>, <schema_name_2>;
Don’t forget to replace <your_publication_name>
with your chosen name for the publication.
4. Backup the Cloud SQL Database Without Data
Next, create a schema-only backup of your Cloud SQL database, excluding the data. This schema will be used to initialize the on-premise server.
5. Setting Up PostgreSQL on the On-Premise Server
Now, set up PostgreSQL on your on-premise server. If you opt to use Docker, the following command will spin up a PostgreSQL instance:
docker run --name postgres_on_premise -e POSTGRES_USER=<your_user> -e POSTGRES_PASSWORD=<your_password> -d postgres:<version>
You can replace Docker with your own method of deploying PostgreSQL. The replication process should work the same way regardless of how PostgreSQL is deployed on the on-premise server.
6. Restore the Backup on the On-Premise Server
Once the PostgreSQL instance is running, you can setup a connection to the database using a database tool of your choice, but for this guide I recommend using DBeaver. Once connected to the database you may now restore the backup created from the previous step.
7. Whitelist the On-Premise Server’s IP Address on Cloud SQL
Ensure the on-premise server can access Cloud SQL by whitelisting its IP address in the Connections tab of your Cloud SQL instance.
8. Create a Subscription on the On-Premise Server
Create a subscription on your on-premise PostgreSQL instance to receive data from Cloud SQL. Run the following command:
CREATE SUBSCRIPTION <your_subscription_name>
CONNECTION 'host=<cloud_sql_ip> port=5432 dbname=<your_database> user=<your_user> password=<your_password>'
PUBLICATION <your_publication_name>;
Don’t forget to replace <your_subscription_name>
, <cloud_sql_ip>
, <your_database>
, <your_user>
, <your_password>
, and <your_publication_name>
with the relevant values.
9. Enable Synchronous Commit for Data Durability
To ensure data durability and consistency, especially in business continuity scenarios, enable synchronous commit:
ALTER SUBSCRIPTION <your_subscription_name> SET (synchronous_commit = on);
This setting ensures that data changes are acknowledged only after they have been fully replicated to the subscriber. It helps prevent data loss in case of failure by ensuring that both the primary and replica are synchronized before committing transactions.
10. Verify the Publication on Cloud SQL
Run the following queries to check that the publication is set up correctly and that replication slots are available:
SELECT * FROM pg_publication;
SELECT * FROM pg_replication_slots;
SELECT * FROM pg_publication_tables;
These queries should return details about the publication and replication status.
11. Verify the Subscription on the On-Premise Server
On the on-premise PostgreSQL instance, verify that the subscription is active and receiving data:
SELECT * FROM pg_subscription;
SELECT * FROM pg_stat_subscription;
12. Monitor Replication Lag on the On-Premise Server
Check for replication lag to ensure data is being replicated without delay:
SELECT pg_wal_lsn_diff(received_lsn, latest_end_lsn) AS replication_lag
FROM pg_stat_subscription
WHERE subname = '<your_subscription_name>';
You can also use the following query to get more details:
SELECT
sub.subname AS subscription_name,
pg_current_wal_lsn() AS current_lsn,
pg_last_wal_receive_lsn() AS last_received_lsn,
pg_last_wal_replay_lsn() AS last_replayed_lsn,
pg_wal_lsn_diff(pg_current_wal_lsn(), pg_last_wal_receive_lsn()) AS replication_lag
FROM
pg_subscription sub
WHERE sub.subname = '<your_subscription_name>';
Best Practices and Considerations
1. Regular Monitoring:
Keep a close watch on replication slots, WAL senders, and network connections. This will help you identify issues before they affect replication performance.
2. Synchronous vs. Asynchronous Replication:
If data consistency is your top priority, enable synchronous replication, which ensures transactions are acknowledged only after replication. For faster performance, you may opt for asynchronous replication, though it comes with the risk of losing some data if the primary instance fails.
3. Backup Strategy:
Even though replication provides real-time synchronization, it is not a substitute for a proper backup plan. Always maintain regular backups of your database for added security.
4. Network Stability:
Ensure a reliable and secure connection between Cloud SQL and the on-premise server. Any network instability could lead to delays or replication failures.
5. Security Considerations:
While this article touches on basic security measures, it doesn’t cover all aspects of securing a replication setup. It’s essential to follow your organization’s security standards and guidelines, especially regarding network security, data encryption, and authentication. Every environment has unique security requirements, so it’s crucial to collaborate with your IT and security teams to ensure all bases are covered.
Common Pitfalls to Avoid
- Forgetting to Whitelist IPs: Make sure the on-premise server’s IP is allowed to access Cloud SQL. Not doing so will block replication.
- Incorrect User Privileges: Ensure the PostgreSQL user has the appropriate replication privileges, or the replication will not work.
- Replication Slot Limits: Be mindful of how many replication slots are in use. Over-saturating the slots can prevent new replicas from being created.
Conclusion
PostgreSQL replication between Google Cloud SQL and an on-premise server can play a vital role in business continuity planning. This guide outlines the steps needed to set up this replication using Docker as an option for deploying PostgreSQL on the on-premise server. However, you are not limited to using Docker; you can deploy PostgreSQL using any other method you prefer. Keep in mind the best practices, and monitor the replication process regularly to maintain optimal performance.
Additionally, I encourage you to share your unique approach and experience in setting up replication or handling similar tasks. By sharing different methods and lessons learned, we can foster continuous learning and improvement within the community. Every challenge or success provides valuable insights for others, and contributing to this shared knowledge helps us grow together.
Comments
Post a Comment