{"id":177,"date":"2022-07-12T13:43:53","date_gmt":"2022-07-12T11:43:53","guid":{"rendered":"https:\/\/www.hackenberger.at\/blog\/?p=177"},"modified":"2022-10-14T11:11:56","modified_gmt":"2022-10-14T09:11:56","slug":"how-to-postgresql-upgrades-on-large-dbs","status":"publish","type":"post","link":"https:\/\/www.hackenberger.at\/blog\/2022\/07\/12\/how-to-postgresql-upgrades-on-large-dbs\/","title":{"rendered":"How-to: PostgreSQL upgrades on large DBs"},"content":{"rendered":"\n<p>TL;DR; Use <a href=\"https:\/\/github.com\/2ndQuadrant\/pglogical\">pglogical<\/a> to upgrade between major versions of PostgreSQL with minimal downtime when running very large DB installations.<\/p>\n\n\n\n<p>For databases beyond a couple of GB in size, an upgrade between major versions of PostgreSQL would need to use pg_basebackup and pg_upgrade, but at a size of around 1TB, even this means too long of a downtime for a SaaS application. The solution is to use logical asynchronous replication to create a hot-standby running a newer version of PostgreSQL. pglogical is an extension that works well for this, down to PostgreSQL version 9.4.<\/p>\n\n\n\n<p>The upgrade procedure we used consists roughly of the following steps:<\/p>\n\n\n\n<ul><li>Prepare the new DB (primary keys, schema, users)<\/li><li>Install pglogical into the DB<\/li><li>Drop indices on large tables<\/li><li>Synchronise the data<\/li><li>Re-create indices and sync sequences<\/li><li>Switch over<\/li><\/ul>\n\n\n\n<p>At <a href=\"https:\/\/www.gps-infostars.com\/\">Infostars<\/a> we upgraded the ~1TB Postgresql 9.4 database with multiple extensions (PostGIS, audit) to version 13 with a minimal downtime window of &lt; 10 minutes. The production set-up consists of two DB servers (hot standby) with pgpool2 in front of the DB.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Preparation<\/h2>\n\n\n\n<p>As a preparation, we set-up a second set of master\/standby servers and provisioned them using <a href=\"https:\/\/www.ansible.com\/\">ansible<\/a>. We use ansible to set-up the docker host and copy over a docker-compose set-up, which contains the backend, cron jobs and HTTP servers for the frontend.<\/p>\n\n\n\n<p>In order to be able to use pglogical, it&#8217;s important that every table has a primary key constraint. The following query shows tables which are missing the primary key:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo -u postgres psql $DBNAME -c \"SELECT\n  n.nspname as schema,\n  c.relname as table\nFROM pg_class c\nJOIN pg_namespace n ON n.oid = c.relnamespace\nWHERE c.relkind = 'r'\nAND NOT EXISTS (\n  SELECT 1 FROM pg_constraint con\n  WHERE con.conrelid = c.oid AND con.contype = 'p'\n)\nAND n.nspname &lt;&gt; ALL (\n  ARRAY &#91;'pg_catalog', 'sys', 'dbo', 'information_schema']\n);\"<\/code><\/pre>\n\n\n\n<p>We had to fix the schema a little, as there were some old tables left over from the days we used hibernate auto update, instead of <a href=\"https:\/\/www.liquibase.org\/\">liquibase<\/a>.<\/p>\n\n\n\n<p>In order to connect the new to the old servers, we use an OpenVPN tunnel. The servers have the following internal IPs that you&#8217;ll find in the scripts below:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>10.56.34.2  New master DB server\n10.56.34.1  Old master DB server<\/code><\/pre>\n\n\n\n<p>In order to make the migration a little easier, I set-up password-less SSH access from the new server to the old one. I also use a few environment variables in the scripts below, to make it more readable:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>DBNAME=YOURDBNAME\nDBUSER=YOURREGULARAPPDBUSER\nSSHOLD='ssh root@10.56.34.1'\nPSQLOLD=\"ssh root@10.56.34.1 sudo -u postgres psql\"\nPGLPASS=\"YOURSECRETPASSWORDFORPGLOGICAL\"<\/code><\/pre>\n\n\n\n<p>We have all the data in a single database and use two namespaces (<em>public<\/em> and <em>audit<\/em>). We do use different database users to connect to the DB.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Copying the DB schema<\/h2>\n\n\n\n<p>Export the schema and DB users to files:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ $SSHOLD sudo -u postgres pg_dump -Fc -C -s $DBNAME &gt; $DBNAME_schema.dmp\n$ $SSHOLD sudo -u postgres pg_dumpall -g &gt; globals.sql<\/code><\/pre>\n\n\n\n<p><em>DBNAME_schema.dmp <\/em>now contains a PostgreSQL binary dump of only the DB schema and globals.sql a text dump of the DB users. Now make sure your new DB postgresql cluster is up and running. In our case, we only started the master DB and kept the hot-standby off until after the initial synchronisation run.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ sudo -u postgres psql &lt; globals.sql\n$ sudo -u postgres psql -c \"CREATE DATABASE $DBNAME WITH TEMPLATE = template0 ENCODING = 'UTF8' LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8';\"\n$ sudo -u postgres pg_restore -F c -d $DBNAME &lt; $DBNAME_schema.dmp<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Set-up pglogical<\/h2>\n\n\n\n<p>Make sure the pglogical packages are installed on both the old and the new server. Then we set-up a config file with the required parameters for pglogical, as mentioned in the <a href=\"https:\/\/github.com\/2ndQuadrant\/pglogical\">quick set-up<\/a>.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ PGDATA=\/etc\/postgresql\/9.4\/main\/\n$ echo \"include 'pglogical.conf'\" | $SSHOLD \"tee -a $PGDATA\/postgresql.conf\";\n$ echo -e \"wal_level = 'logical'\\nmax_worker_processes = 10\\nmax_replication_slots = 10\\nmax_wal_senders = 10\\nshared_preload_libraries = 'pglogical'\" | $SSOLD \"tee $PGDATA\/pglogical.conf\"\n$ $SSHOLD \"sudo systemctl restart postgresql@9.4-main\"\n\n$ PGDATA=\/etc\/postgresql\/13\/main\/\necho \"include 'pglogical.conf'\" &gt;&gt; $PGDATA\/postgresql.conf; echo -e \"wal_level = 'logical'\\nmax_worker_processes = 10\\nmax_replication_slots = 10\\nmax_wal_senders = 10\\nshared_preload_libraries = 'pglogical'\" &gt;&gt; $PGDATA\/pglogical.conf\n$ sudo systemctl restart postgresql@13-main<\/code><\/pre>\n\n\n\n<p>We need to add the pglogical extension to the DB we want to replicate:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ $PSQLOLD $DBNAME -c '\"CREATE EXTENSION pglogical_origin;\"' # Only for postgresql 9.4\n$ $PSQLOLD $DBNAME -c '\"CREATE EXTENSION pglogical;\"'\nsudo -u postgres psql $DBNAME -c \"CREATE EXTENSION pglogical;\"<\/code><\/pre>\n\n\n\n<p>And we need to grant permissions to allow schema changes through liquibase:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ $PSQLOLD $DBNAME -c \"\\\"GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA pglogical TO $DBUSER;\\\"\"\n$ $PSQLOLD $DBNAME -c \"\\\"GRANT USAGE ON SCHEMA pglogical TO $DBUSER;\\\"\"\n$ $PSQLOLD -c \"\\\"CREATE ROLE pglogical LOGIN REPLICATION SUPERUSER ENCRYPTED PASSWORD '$PGLPASS';\\\"\"\n$ sudo -u postgres psql -c \"CREATE ROLE pglogical LOGIN REPLICATION SUPERUSER ENCRYPTED PASSWORD '$PGLPASS';\"<\/code><\/pre>\n\n\n\n<p>And the pglogical extension needs to have access to the DB. <strong>ATTENTION<\/strong>: Replace 10.56.24.0\/24 with actual IP range of the peer and make sure the subscriber can access their own DB through the IP as well.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>echo -e \"# TODO Added for DB upgrade using pglogical remove after done\\nlocal\\treplication\\tpglogical\\ttrust\\nhost\\treplication\\tpglogical\\t10.56.24.0\/24\\tmd5\\nlocal\\t$DBNAME\\tpglogical\\ttrust\\nhost\\t$DBNAME\\tpglogical\\t10.56.24.0\/24\\tmd5\" | $SSHOLD tee -a \/etc\/postgresql\/9.4\/main\/pg_hba.conf &gt; \/dev\/null\necho -e \"# TODO Added for DB upgrade using pglogical remove after done\\nlocal\\treplication\\tpglogical\\ttrust\\nhost\\treplication\\tpglogical\\t10.56.24.0\/24\\tmd5\\nlocal\\t$DBNAME\\tpglogical\\ttrust\\nhost\\t$DBNAME\\tpglogical\\t10.56.24.0\/24\\tmd5\" | tee -a \/etc\/postgresql\/13\/main\/pg_hba.conf &gt; \/dev\/null<\/code><\/pre>\n\n\n\n<p>Now reload PostgreSQL to apply the access changes. <strong>ATTENTION<\/strong>: Make sure you have log_min_messages in postgresql.conf (or in a conf.d file) set to &#8216;info&#8217; as a minimum, or you won&#8217;t see why initial synchronisation fails later on.<\/p>\n\n\n\n<pre id=\"block-2bb0e852-8ca0-446c-a93c-e7378ab1c7c9\" class=\"wp-block-code\"><code>$ $SSHOLD systemctl reload postgresql@9.4-main.service\n$ systemctl reload postgresql@13-main.service<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Optional: Traffic shaping<\/h2>\n\n\n\n<p>As we are using the same network interface for the OpenVPN connection that&#8217;s also connected to the WAN (i.e. clients using the server), I enabled traffic shaping using wondershaper:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ $SSHOLD apt install wondershaper\n$ $SSHOLD wondershaper -a tun2 -u 215040 -d 215040<\/code><\/pre>\n\n\n\n<p>tun2 is the OpenVPN network interface and I limited the bandwith to 210MBit\/s (210 * 1024).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Dropping indices on the new DB<\/h2>\n\n\n\n<p>As pglogical is using the SQL <em>COPY<\/em> command to copy over data, the new DB would build the indices up while the data is being replicated. For large tables that would take ages and delay the initial copy phase. It&#8217;s way for efficient to drop the indices on the new DB before the initial replication and build them concurrently afterwards.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ sudo -u postgres pg_dump -s -t LARGE_TABLE_NAME $DBNAME &gt; LARGE_TABLE_schema.sql\n$ grep INDEX LARGE_TABLE_schema.sql | sed 's\/^.*INDEX \\(&#91;^ ]*\\).*\/DROP INDEX \\1;\/' | sudo -u postgres psql $DBNAME<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Creating the replication set<\/h2>\n\n\n\n<p>pglogical supports several replications sets, but we&#8217;ll just use the <em>default<\/em> as it&#8217;s meant for INSERT, UPDATE, DELETE and therefore fits our purpose. The pglogical extension needs to have one provider <em>node<\/em>  and one subscriber <em>node<\/em> created, on the provider\/subscriber respectively. The <em>node<\/em> as well as the <em>subscription<\/em> (refers to a replication set) need a postgresql DSN (how to connect) string as a parameter.<\/p>\n\n\n\n<p>We are using two namespaces: <em>public<\/em> and <em>audit<\/em> (from the <a href=\"https:\/\/wiki.postgresql.org\/wiki\/Audit_trigger\">audit trigger<\/a>) and need to exclude the table <em>spatial_ref_sys<\/em> from the <a href=\"https:\/\/postgis.net\">PostGIS<\/a> extension, as the table is automatically filled when the PostGIS extension is created (while we restored the schema to the new DB above). So you&#8217;ll have to adapt the commands here a little (also the IPs)<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ $PSQLOLD $DBNAME -c \"\\\"SELECT pglogical.create_node(node_name := 'provider', dsn := 'host=10.56.34.1 port=5432 dbname=$DBNAME user=pglogical');\\\"\"\n$ $PSQLOLD $DBNAME -c \"\\\"SELECT pglogical.replication_set_add_all_tables('default', '{public}'::text&#91;]);\\\"\"\n$ $PSQLOLD $DBNAME -c \"\\\"SELECT pglogical.replication_set_add_all_tables('default', '{audit}'::text&#91;]);\\\"\"\n$ $PSQLOLD $DBNAME -c \"\\\"SELECT pglogical.replication_set_add_all_sequences(set_name := 'default', schema_names := '{public}'::text&#91;], synchronize_data := true )\\\"\"\n$ $PSQLOLD $DBNAME -c \"\\\"SELECT pglogical.replication_set_add_all_sequences(set_name := 'default', schema_names := '{audit}'::text&#91;], synchronize_data := true )\\\"\"\n$ $PSQLOLD $DBNAME -c \"\\\"SELECT pglogical.replication_set_remove_table(set_name := 'default', relation := 'spatial_ref_sys')\\\"\"<\/code><\/pre>\n\n\n\n<p>If you made a mistake, you can remove all tables \/ sequences in a replication set like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT pglogical.replication_set_remove_table('default', set_reloid) FROM pglogical.replication_set_table;\nSELECT pglogical.replication_set_remove_sequence('default', set_seqoid) FROM pglogical.replication_set_seq;<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Start the replication<\/h2>\n\n\n\n<p>Now create the subscriber node and the actual subscription (starts the replication):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ sudo -u postgres psql $DBNAME -c \"SELECT pglogical.create_node(node_name := 'subscriber', dsn := 'host=10.56.34.2 port=5432 dbname=$DBNAME user=pglogical password=$PGLPASS');\"\n$ sudo -u postgres psql $DBNAME -c \"SELECT pglogical.create_subscription(subscription_name := 'subscription', provider_dsn := 'host=10.56.34.1 port=5432 dbname=$DBNAME user=pglogical password=$PGLPASS', replication_sets := '{default}'::text&#91;] );\"<\/code><\/pre>\n\n\n\n<p>As you can see the <em>connection string (DSN)<\/em> for the node refers to the new primary DB host, whereas the connection string for the subscriptions refers to the old primary DB host.<\/p>\n\n\n\n<p>Now look at the logs on the new primary. As mentioned above, make sure your <em>log_min_messages<\/em> parameter in postgresql.conf (or in a conf.d file) is set to &#8216;info&#8217; as a minimum, or you won&#8217;t see why initial synchronisation fails. You can ignore messages like <em>WARNING: snapshot 0x55a9335acdc8 still active.<\/em><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>tail -f -n400 \/var\/log\/postgresql\/postgresql-13-main.log | grep -v -e 'WARNING:  snapshot' -e 'apply COMMIT in commit'<\/code><\/pre>\n\n\n\n<p>If you like you can ask postgresql to show the status of the WAL replication (on the provider \/ old primary) and wait for initial synchronisation to complete (on the subscriber \/ new primary):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ $PSQLOLD -c '\"SELECT * from pg_stat_replication;\"'\n$ sudo -u postgres psql $DBNAME -c \"SELECT pglogical.wait_for_subscription_sync_complete('subscription');\"<\/code><\/pre>\n\n\n\n<p>Once the query returns, you should have the pglogical replication in <em>replication<\/em> state (2nd return column):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ sudo -u postgres psql $DBNAME -c \"SELECT pglogical.show_subscription_status('subscription')\"<\/code><\/pre>\n\n\n\n<p>You can check whether you got all data from large tables by getting the largest ID and then doing the same query on both machines. This query uses a primary key index to be reasonably fast, a regular count(id) would do a full table scan.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>BEGIN TRANSACTION; SELECT count(*) FROM (SELECT DISTINCT id FROM YOUR_LARGE_TABLE WHERE id &lt; YOUR_MAX_ID) t; ROLLBACK;<\/code><\/pre>\n\n\n\n<p>Please be-ware that the replication sequences is a bit special. I handled them manually. See below.<\/p>\n\n\n\n<p>In order to show subscripts, replication sets, and nodes just run a select on the respective tables:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT * FROM pglogical.subscription;\nSELECT * FROM pglogical.replication_set;\nSELECT * FROM pglogical.node;<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Re-create indices<\/h2>\n\n\n\n<p>Now re-create all indices that you&#8217;ve dropped before, concurrently on the subscriber:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ grep INDEX LARGE_TABLE_schema.sql | sed 's\/INDEX \/INDEX CONCURRENTLY \/' | sudo -u postgres psql $DBNAME<\/code><\/pre>\n\n\n\n<p>and wait for all indices to become active. You can check with <em>\\d LARGE_TABLE<\/em> which will show the index as <em>INVALID<\/em> while it&#8217;s still being built. See also <a href=\"https:\/\/www.postgresql.org\/docs\/13\/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY\">PostgreSQL CREATE INDEX<\/a>.<\/p>\n\n\n\n<p>During this time, you can ignore the following errors:<br>ERROR: could not read block 0 in file &#8220;base\/16387\/49241&#8221;: read only 0 of 8192 bytes<br>LOG: apply worker [1210644] at slot 1 generation 8 exiting with error<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Switching over<\/h2>\n\n\n\n<p>Before we did the final switching over, we activated our new hot standby server. We&#8217;ve set-up pgpool2 in a way that this is a single command. See <a href=\"https:\/\/www.pgpool.net\/docs\/42\/en\/html\/runtime-online-recovery.html\">pgpool-II Online recovery<\/a> for instructions.<\/p>\n\n\n\n<p>The procedure of switching over is highly application specific. In our case it consisted of:<\/p>\n\n\n\n<ul><li>Disabling writes on the old primary application server<\/li><li>Syncing sequences to the new primary DB server<\/li><li>Stopping pglogical and shutting down the old DB server<\/li><li>Bringing up the app backend stack on the new primary server<\/li><li>Pointing our floating IP to the new primary server<\/li><\/ul>\n\n\n\n<p>One weak spot with pglogical are sequences. Contrary to the docs, they are never automatically synced in my set-up. I&#8217;ve been able to get <em>pglogical.synchronize_sequence<\/em> to work to sync them manually, but it adds 1000 to the last value, as documented. This is not what I want however, so I synced them manually as a part of switching the production system to the new DB using the following commands. Please make sure you do this <em>after<\/em> you disabled writing to the old primary DB and <em>before<\/em> you enable writing to the new primary DB.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># ATTENTION: Only includes the 'public' namespace, use e.g '\\ds audit.*' to list sequences in other namespaces\n$ sudo -u postgres psql $DBNAME -t -A -c \"\\ds\" | awk -F '|' '{ print $2 }' | sort &gt; sequences.txt\n$ cat sequences.txt | while read seq; do lastval=$(sudo -u postgres psql $DBNAME -t -A -c \"SELECT last_value FROM $seq\" 2&gt;\/dev\/null); echo $seq $lastval; done &gt; sequence-vals.txt\n# on subscriber\n$ cat sequence-vals.txt | while read seq val; do sudo -u postgres psql $DBNAME -c \"SELECT setval('$seq', $val);\"; done<\/code><\/pre>\n\n\n\n<p>After disabling writing to the old DB, you stop the pglogical replication and then shut down the old primary DB just to be safe:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ sudo -u postgres psql $DBNAME -c \"SELECT pglogical.drop_subscription('subscription');\"\n$ sudo -u postgres psql $DBNAME -c \"SELECT pglogical.drop_node(node_name := 'subscriber');\"\n$ $SSHOLD systemctl stop postgresql@9.4-main<\/code><\/pre>\n\n\n\n<p>For our own migration, at this point we would bring up pgpool2, all the docker services to re-enable our backend and point our floating IP to the new server.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ echo DONE<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>TL;DR; Use pglogical to upgrade between major versions of PostgreSQL with minimal downtime when running very large DB installations. For databases beyond a couple of GB in size, an upgrade between major versions of PostgreSQL would need to use pg_basebackup and pg_upgrade, but at a size of around 1TB, even this means too long of &hellip; <a href=\"https:\/\/www.hackenberger.at\/blog\/2022\/07\/12\/how-to-postgresql-upgrades-on-large-dbs\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">How-to: PostgreSQL upgrades on large DBs<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[8,7],"views":894,"_links":{"self":[{"href":"https:\/\/www.hackenberger.at\/blog\/wp-json\/wp\/v2\/posts\/177"}],"collection":[{"href":"https:\/\/www.hackenberger.at\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.hackenberger.at\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.hackenberger.at\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.hackenberger.at\/blog\/wp-json\/wp\/v2\/comments?post=177"}],"version-history":[{"count":7,"href":"https:\/\/www.hackenberger.at\/blog\/wp-json\/wp\/v2\/posts\/177\/revisions"}],"predecessor-version":[{"id":219,"href":"https:\/\/www.hackenberger.at\/blog\/wp-json\/wp\/v2\/posts\/177\/revisions\/219"}],"wp:attachment":[{"href":"https:\/\/www.hackenberger.at\/blog\/wp-json\/wp\/v2\/media?parent=177"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.hackenberger.at\/blog\/wp-json\/wp\/v2\/categories?post=177"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.hackenberger.at\/blog\/wp-json\/wp\/v2\/tags?post=177"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}