4.6. Backing up PostgreSQL data
To back up PostgreSQL data, use one of the following approaches:
- SQL dump
- See Backing up with SQL dump.
- File system level backup
- See File system level backup.
- Continuous archiving
- See Continuous archiving.
4.6.1. Backing up PostgreSQL data with an SQL dump
The SQL dump method is based on generating a dump file with SQL commands. When a dump is uploaded back to the database server, it recreates the database in the same state as it was at the time of the dump.
The SQL dump is ensured by the following PostgreSQL client applications:
- pg_dump dumps a single database without cluster-wide information about roles or tablespaces
- pg_dumpall dumps each database in a given cluster and preserves cluster-wide data, such as role and tablespace definitions.
By default, the pg_dump
and pg_dumpall
commands write their results into the standard output. To store the dump in a file, redirect the output to an SQL file. The resulting SQL file can be either in a text format or in other formats that allow for parallelism and for more detailed control of object restoration.
You can perform the SQL dump from any remote host that has access to the database.
4.6.1.1. Advantages and disadvantages of an SQL dump
An SQL dump has the following advantages compared to other PostgreSQL backup methods:
- An SQL dump is the only PostgreSQL backup method that is not server version-specific. The output of the pg_dump utility can be reloaded into later versions of PostgreSQL, which is not possible for file system level backups or continuous archiving.
- An SQL dump is the only method that works when transferring a database to a different machine architecture, such as going from a 32-bit to a 64-bit server.
- An SQL dump provides internally consistent dumps. A dump represents a snapshot of the database at the time pg_dump began running.
- The pg_dump utility does not block other operations on the database when it is running.
A disadvantage of an SQL dump is that it takes more time compared to file system level backup.
4.6.1.2. Performing an SQL dump using pg_dump
To dump a single database without cluster-wide information, use the pg_dump utility.
Conditions préalables
-
You must have read access to all tables that you want to dump. To dump the entire database, you must run the commands as the
postgres
superuser or a user with database administrator privileges.
Procédure
Dump a database without cluster-wide information:
$ pg_dump dbname > dumpfile
To specify which database server pg_dump will contact, use the following command-line options:
The
-h
option to define the host.The default host is either the local host or what is specified by the
PGHOST
environment variable.The
-p
option to define the port.The default port is indicated by the
PGPORT
environment variable or the compiled-in default.
4.6.1.3. Performing an SQL dump using pg_dumpall
To dump each database in a given database cluster and to preserve cluster-wide data, use the pg_dumpall utility.
Conditions préalables
-
You must run the commands as the
postgres
superuser or a user with database administrator privileges.
Procédure
Dump all databases in the database cluster and preserve cluster-wide data:
$ pg_dumpall > dumpfile
To specify which database server pg_dumpall will contact, use the following command-line options:
The
-h
option to define the host.The default host is either the local host or what is specified by the
PGHOST
environment variable.The
-p
option to define the port.The default port is indicated by the
PGPORT
environment variable or the compiled-in default.The
-l
option to define the default database.This option enables you to choose a default database different from the
postgres
database created automatically during initialization.
4.6.1.4. Restoring a database dumped using pg_dump
To restore a database from an SQL dump that you dumped using the pg_dump utility, follow the steps below.
Conditions préalables
-
You must run the commands as the
postgres
superuser or a user with database administrator privileges.
Procédure
Create a new database:
$ createdb dbname
- Verify that all users who own objects or were granted permissions on objects in the dumped database already exist. If such users do not exist, the restore fails to recreate the objects with the original ownership and permissions.
Run the psql utility to restore a text file dump created by the pg_dump utility:
$ psql dbname < dumpfile
where
dumpfile
is the output of thepg_dump
command. To restore a non-text file dump, use thepg_restore
utility instead:$ pg_restore non-plain-text-file
4.6.1.5. Restoring databases dumped using pg_dumpall
To restore data from a database cluster that you dumped using the pg_dumpall utility, follow the steps below.
Conditions préalables
-
You must run the commands as the
postgres
superuser or a user with database administrator privileges.
Procédure
- Ensure that all users who own objects or were granted permissions on objects in the dumped databases already exist. If such users do not exist, the restore fails to recreate the objects with the original ownership and permissions.
Run the psql utility to restore a text file dump created by the pg_dumpall utility:
$ psql < dumpfile
where
dumpfile
is the output of thepg_dumpall
command.
4.6.1.6. Performing an SQL dump of a database on another server
Dumping a database directly from one server to another is possible because pg_dump and psql can write to and read from pipes.
Procédure
To dump a database from one server to another, run:
$ pg_dump -h host1 dbname | psql -h host2 dbname
4.6.1.7. Handling SQL errors during restore
By default, psql continues to execute if an SQL error occurs, causing the database to restore only partially.
To change the default behavior, use one of the following approaches when restoring a dump.
Conditions préalables
-
You must run the commands as the
postgres
superuser or a user with database administrator privileges.
Procédure
Make psql exit with an exit status of 3 if an SQL error occurs by setting the
ON_ERROR_STOP
variable:$ psql --set ON_ERROR_STOP=on dbname < dumpfile
Specify that the whole dump is restored as a single transaction so that the restore is either fully completed or canceled.
When restoring a text file dump using the
psql
utility:$ psql -1
When restoring a non-text file dump using the
pg_restore
utility:$ pg_restore -e
Note that when using this approach, even a minor error can cancel a restore operation that has already run for many hours.
4.6.1.8. Ressources supplémentaires
4.6.2. Backing up PostgreSQL data with a file system level backup
To create a file system level backup, copy PostgreSQL database files to another location. For example, you can use any of the following approaches:
- Create an archive file using the tar utility.
- Copy the files to a different location using the rsync utility.
- Create a consistent snapshot of the data directory.
4.6.2.1. Advantages and limitations of file system backing up
File system level backing up has the following advantage compared to other PostgreSQL backup methods:
- File system level backing up is usually faster than an SQL dump.
File system level backing up has the following limitations compared to other PostgreSQL backup methods:
- This backing up method is not suitable when you want to upgrade from RHEL 8 to RHEL 9 and migrate your data to the upgraded system. File system level backup is specific to an architecture and a RHEL major version. You can restore your data on your RHEL 8 system if the upgrade is not successful but you cannot restore the data on a RHEL 9 system.
- The database server must be shut down before backing up and restoring data.
- Backing up and restoring certain individual files or tables is impossible. Backing up a file system works only for complete backing up and restoring of an entire database cluster.
4.6.2.2. Performing file system level backing up
To perform file system level backing up, use the following procedure.
Procédure
Choose the location of a database cluster and initialize this cluster:
# postgresql-setup --initdb
Stop the postgresql service:
# systemctl stop postgresql.service
Use any method to create a file system backup, for example a
tar
archive:$ tar -cf backup.tar /var/lib/pgsql/data
Start the postgresql service:
# systemctl start postgresql.service
Ressources supplémentaires
4.6.3. Backing up PostgreSQL data by continuous archiving
4.6.3.1. Introduction to continuous archiving
PostgreSQL records every change made to the database’s data files into a write ahead log (WAL) file that is available in the pg_wal/
subdirectory of the cluster’s data directory. This log is intended primarily for a crash recovery. After a crash, the log entries made since the last checkpoint can be used for restoring the database to a consistency.
The continuous archiving method, also known as an online backup, combines the WAL files with a copy of the database cluster in the form of a base backup performed on a running server or a file system level backup.
If a database recovery is needed, you can restore the database from the copy of the database cluster and then replay log from the backed up WAL files to bring the system to the current state.
With the continuous archiving method, you must keep a continuous sequence of all archived WAL files that extends at minimum back to the start time of your last base backup. Therefore the ideal frequency of base backups depends on:
- The storage volume available for archived WAL files.
- The maximum possible duration of data recovery in situations when recovery is necessary. In cases with a long period since the last backup, the system replays more WAL segments, and the recovery therefore takes more time.
You cannot use pg_dump and pg_dumpall SQL dumps as a part of a continuous archiving backup solution. SQL dumps produce logical backups and do not contain enough information to be used by a WAL replay.
To perform a database backup and restore using the continuous archiving method, follow these instructions:
- Set up and test your procedure for archiving WAL files - see WAL archiving.
- Perform a base backup - see base backup.
To restore your data, follow instructions in Restoring database with continuous archiving.
4.6.3.2. Advantages and disadvantages of continuous archiving
Continuous archiving has the following advantages compared to other PostgreSQL backup methods:
- With the continuous backup method, it is possible to use a base backup that is not entirely consistent because any internal inconsistency in the backup is corrected by the log replay. Therefore you can perform a base backup on a running PostgreSQL server.
-
A file system snapshot is not needed;
tar
or a similar archiving utility is sufficient. - Continuous backup can be achieved by continuing to archive the WAL files because the sequence of WAL files for the log replay can be indefinitely long. This is particularly valuable for large databases.
- Continuous backup supports point-in-time recovery. It is not necessary to replay the WAL entries to the end. The replay can be stopped at any point and the database can be restored to its state at any time since the base backup was taken.
- If the series of WAL files are continuously available to another machine that has been loaded with the same base backup file, it is possible to restore the other machine with a nearly-current copy of the database at any point.
Continuous archiving has the following disadvantages compared to other PostgreSQL backup methods:
- Continuous backup method supports only restoration of an entire database cluster, not a subset.
- Continuous backup requires extensive archival storage.
4.6.3.3. Setting up WAL archiving
A running PostgreSQL server produces a sequence of write ahead log (WAL) records. The server physically divides this sequence into WAL segment files, which are given numeric names that reflect their position in the WAL sequence. Without WAL archiving, the segment files are reused and renamed to higher segment numbers.
When archiving WAL data, the contents of each segment file are captured and saved at a new location before the segment file is reused. You have multiple options where to save the content, such as an NFS-mounted directory on another machine, a tape drive, or a CD.
Note that WAL records do not include changes to configuration files.
To enable WAL archiving, use the following procedure.
Procédure
In the
/var/lib/pgsql/data/postgresql.conf
file:-
Set the
wal_level
configuration parameter toreplica
or higher. -
Set the
archive_mode
parameter toon
. -
Specify the shell command in the
archive_command
configuration parameter. You can use thecp
command, another command, or a shell script.
-
Set the
Restart the
postgresql
service to enable the changes:# systemctl restart postgresql.service
- Test your archive command and ensure it does not overwrite an existing file and that it returns a non-zero exit status if it fails.
- To protect your data, ensure that the segment files are archived into a directory that does not have group or world read access.
The archive command is executed only on completed WAL segments. A server that generates little WAL traffic can have a substantial delay between the completion of a transaction and its safe recording in archive storage. To limit how old unarchived data can be, you can:
-
Set the
archive_timeout
parameter to force the server to switch to a new WAL segment file with a given frequency. -
Use the
pg_switch_wal
parameter to force a segment switch to ensure that a transaction is archived immediately after it finishes.
Exemple 4.5. Shell command for archiving WAL segments
This example shows a simple shell command you can set in the archive_command
configuration parameter.
The following command copies a completed segment file to the required location:
archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
where the %p
parameter is replaced by the relative path to the file to archive and the %f
parameter is replaced by the file name.
This command copies archivable WAL segments to the /mnt/server/archivedir/
directory. After replacing the %p
and %f
parameters, the executed command looks as follows:
test ! -f /mnt/server/archivedir/00000001000000A900000065 && cp pg_wal/00000001000000A900000065 /mnt/server/archivedir/00000001000000A900000065
A similar command is generated for each new file that is archived.
Ressources supplémentaires
4.6.3.4. Making a base backup
You can create a base backup in several ways. This section describes the simplest way of performing a base backup using the pg_basebackup utility on a running PostgreSQL server.
The base backup process creates a backup history file that is stored into the WAL archive area and is named after the first WAL segment file that you need for the base backup.
The backup history file is a small text file containing the starting and ending times, and WAL segments of the backup. If you used the label string to identify the associated dump file, you can use the backup history file to determine which dump file to restore.
Consider keeping several backup sets to be certain that you can recover your data.
To perform a base backup, use the following procedure.
Conditions préalables
-
You must run the commands as the
postgres
superuser, a user with database administrator privileges, or another user with at leastREPLICATION
permissions. - You must keep all the WAL segment files generated during and after the base backup.
Procédure
Use the
pg_basebackup
utility to perform the base backup.To create a base backup as individual files (plain format):
$ pg_basebackup -D backup_directory -Fp
Replace backup_directory with your desired backup location.
If you use tablespaces and perform the base backup on the same host as the server, you must also use the
--tablespace-mapping
option, otherwise the backup will fail upon an attempt to write the backup to the same location.To create a base backup as a
tar
archive (tar
and compressed format):$ pg_basebackup -D backup_directory -Ft -z
Replace backup_directory with your desired backup location.
To restore such data, you must manually extract the files in the correct locations.
- After the base backup process is complete, safely archive the copy of the database cluster and the WAL segment files used during the backup, which are specified in the backup history file.
- Delete WAL segments numerically lower than the WAL segment files used in the base backup because these are older than the base backup and no longer needed for a restore.
To specify which database server pg_basebackup will contact, use the following command-line options:
The
-h
option to define the host.The default host is either the local host or a host specified by the
PGHOST
environment variable.The
-p
option to define the port.The default port is indicated by the
PGPORT
environment variable or the compiled-in default.
Ressources supplémentaires
4.6.3.5. Restoring the database using a continuous archive backup
To restore a database using a continuous backup, use the following procedure.
Procédure
Stop the server:
# systemctl stop postgresql.service
Copy the necessary data to a temporary location.
Preferably, copy the whole cluster data directory and any tablespaces. Note that this requires enough free space on your system to hold two copies of your existing database.
If you do not have enough space, save the contents of the cluster’s
pg_wal
directory, which can contain logs that were not archived before the system went down.- Remove all existing files and subdirectories under the cluster data directory and under the root directories of any tablespaces you are using.
Restore the database files from your base backup.
Ensure that:
-
The files are restored with the correct ownership (the database system user, not
root
). - The files are restored with the correct permissions.
-
The symbolic links in the
pg_tblspc/
subdirectory are restored correctly.
-
The files are restored with the correct ownership (the database system user, not
Remove any files present in the
pg_wal/
subdirectory.These files resulted from the base backup and are therefore obsolete. If you did not archive
pg_wal/
, recreate it with proper permissions.-
Copy any unarchived WAL segment files that you saved in step 2 into
pg_wal/
. Create the
recovery.conf
recovery command file in the cluster data directory and specify the shell command in therestore_command
configuration parameter. You can use thecp
command, another command, or a shell script. For example:restore_command = 'cp /mnt/server/archivedir/%f "%p"'
Start the server:
# systemctl start postgresql.service
The server will enter the recovery mode and proceed to read through the archived WAL files that it needs.
If the recovery is terminated due to an external error, the server can be restarted and it will continue the recovery. When the recovery process is completed, the server renames
recovery.conf
torecovery.done
. This prevents the server from accidental re-entering the recovery mode after it starts normal database operations.Check the contents of the database to verify that the database has recovered into the required state.
If the database has not recovered into the required state, return to step 1. If the database has recovered into the required state, allow the users to connect by restoring the client authentication configuration in the
pg_hba.conf
file.
For more information about restoring using the continuous backup, see PostgreSQL Documentation.