Rechercher

4.6. Backing up PostgreSQL data

download PDF

To back up PostgreSQL data, use one of the following approaches:

SQL dump
See Backing up with SQL dump.
File system level backup
See File system level backup.
Continuous archiving
See Continuous archiving.

4.6.1. Backing up PostgreSQL data with an SQL dump

The SQL dump method is based on generating a dump file with SQL commands. When a dump is uploaded back to the database server, it recreates the database in the same state as it was at the time of the dump.

The SQL dump is ensured by the following PostgreSQL client applications:

  • pg_dump dumps a single database without cluster-wide information about roles or tablespaces
  • pg_dumpall dumps each database in a given cluster and preserves cluster-wide data, such as role and tablespace definitions.

By default, the pg_dump and pg_dumpall commands write their results into the standard output. To store the dump in a file, redirect the output to an SQL file. The resulting SQL file can be either in a text format or in other formats that allow for parallelism and for more detailed control of object restoration.

You can perform the SQL dump from any remote host that has access to the database.

4.6.1.1. Advantages and disadvantages of an SQL dump

An SQL dump has the following advantages compared to other PostgreSQL backup methods:

  • An SQL dump is the only PostgreSQL backup method that is not server version-specific. The output of the pg_dump utility can be reloaded into later versions of PostgreSQL, which is not possible for file system level backups or continuous archiving.
  • An SQL dump is the only method that works when transferring a database to a different machine architecture, such as going from a 32-bit to a 64-bit server.
  • An SQL dump provides internally consistent dumps. A dump represents a snapshot of the database at the time pg_dump began running.
  • The pg_dump utility does not block other operations on the database when it is running.

A disadvantage of an SQL dump is that it takes more time compared to file system level backup.

4.6.1.2. Performing an SQL dump using pg_dump

To dump a single database without cluster-wide information, use the pg_dump utility.

Conditions préalables

  • You must have read access to all tables that you want to dump. To dump the entire database, you must run the commands as the postgres superuser or a user with database administrator privileges.

Procédure

  • Dump a database without cluster-wide information:

    $ pg_dump dbname > dumpfile

To specify which database server pg_dump will contact, use the following command-line options:

  • The -h option to define the host.

    The default host is either the local host or what is specified by the PGHOST environment variable.

  • The -p option to define the port.

    The default port is indicated by the PGPORT environment variable or the compiled-in default.

4.6.1.3. Performing an SQL dump using pg_dumpall

To dump each database in a given database cluster and to preserve cluster-wide data, use the pg_dumpall utility.

Conditions préalables

  • You must run the commands as the postgres superuser or a user with database administrator privileges.

Procédure

  • Dump all databases in the database cluster and preserve cluster-wide data:

    $ pg_dumpall > dumpfile

To specify which database server pg_dumpall will contact, use the following command-line options:

  • The -h option to define the host.

    The default host is either the local host or what is specified by the PGHOST environment variable.

  • The -p option to define the port.

    The default port is indicated by the PGPORT environment variable or the compiled-in default.

  • The -l option to define the default database.

    This option enables you to choose a default database different from the postgres database created automatically during initialization.

4.6.1.4. Restoring a database dumped using pg_dump

To restore a database from an SQL dump that you dumped using the pg_dump utility, follow the steps below.

Conditions préalables

  • You must run the commands as the postgres superuser or a user with database administrator privileges.

Procédure

  1. Create a new database:

    $ createdb dbname
  2. Verify that all users who own objects or were granted permissions on objects in the dumped database already exist. If such users do not exist, the restore fails to recreate the objects with the original ownership and permissions.
  3. Run the psql utility to restore a text file dump created by the pg_dump utility:

    $ psql dbname < dumpfile

    where dumpfile is the output of the pg_dump command. To restore a non-text file dump, use the pg_restore utility instead:

    $ pg_restore non-plain-text-file

4.6.1.5. Restoring databases dumped using pg_dumpall

To restore data from a database cluster that you dumped using the pg_dumpall utility, follow the steps below.

Conditions préalables

  • You must run the commands as the postgres superuser or a user with database administrator privileges.

Procédure

  1. Ensure that all users who own objects or were granted permissions on objects in the dumped databases already exist. If such users do not exist, the restore fails to recreate the objects with the original ownership and permissions.
  2. Run the psql utility to restore a text file dump created by the pg_dumpall utility:

    $ psql < dumpfile

    where dumpfile is the output of the pg_dumpall command.

4.6.1.6. Performing an SQL dump of a database on another server

Dumping a database directly from one server to another is possible because pg_dump and psql can write to and read from pipes.

Procédure

  • To dump a database from one server to another, run:

    $ pg_dump -h host1 dbname | psql -h host2 dbname

4.6.1.7. Handling SQL errors during restore

By default, psql continues to execute if an SQL error occurs, causing the database to restore only partially.

To change the default behavior, use one of the following approaches when restoring a dump.

Conditions préalables

  • You must run the commands as the postgres superuser or a user with database administrator privileges.

Procédure

  • Make psql exit with an exit status of 3 if an SQL error occurs by setting the ON_ERROR_STOP variable:

    $ psql --set ON_ERROR_STOP=on dbname < dumpfile

  • Specify that the whole dump is restored as a single transaction so that the restore is either fully completed or canceled.

    • When restoring a text file dump using the psql utility:

      $ psql -1
    • When restoring a non-text file dump using the pg_restore utility:

      $ pg_restore -e

      Note that when using this approach, even a minor error can cancel a restore operation that has already run for many hours.

4.6.1.8. Ressources supplémentaires

4.6.2. Backing up PostgreSQL data with a file system level backup

To create a file system level backup, copy PostgreSQL database files to another location. For example, you can use any of the following approaches:

  • Create an archive file using the tar utility.
  • Copy the files to a different location using the rsync utility.
  • Create a consistent snapshot of the data directory.

4.6.2.1. Advantages and limitations of file system backing up

File system level backing up has the following advantage compared to other PostgreSQL backup methods:

  • File system level backing up is usually faster than an SQL dump.

File system level backing up has the following limitations compared to other PostgreSQL backup methods:

  • This backing up method is not suitable when you want to upgrade from RHEL 8 to RHEL 9 and migrate your data to the upgraded system. File system level backup is specific to an architecture and a RHEL major version. You can restore your data on your RHEL 8 system if the upgrade is not successful but you cannot restore the data on a RHEL 9 system.
  • The database server must be shut down before backing up and restoring data.
  • Backing up and restoring certain individual files or tables is impossible. Backing up a file system works only for complete backing up and restoring of an entire database cluster.

4.6.2.2. Performing file system level backing up

To perform file system level backing up, use the following procedure.

Procédure

  1. Choose the location of a database cluster and initialize this cluster:

    # postgresql-setup --initdb
  2. Stop the postgresql service:

    # systemctl stop postgresql.service
  3. Use any method to create a file system backup, for example a tar archive:

    $ tar -cf backup.tar /var/lib/pgsql/data
  4. Start the postgresql service:

    # systemctl start postgresql.service

4.6.3. Backing up PostgreSQL data by continuous archiving

4.6.3.1. Introduction to continuous archiving

PostgreSQL records every change made to the database’s data files into a write ahead log (WAL) file that is available in the pg_wal/ subdirectory of the cluster’s data directory. This log is intended primarily for a crash recovery. After a crash, the log entries made since the last checkpoint can be used for restoring the database to a consistency.

The continuous archiving method, also known as an online backup, combines the WAL files with a copy of the database cluster in the form of a base backup performed on a running server or a file system level backup.

If a database recovery is needed, you can restore the database from the copy of the database cluster and then replay log from the backed up WAL files to bring the system to the current state.

With the continuous archiving method, you must keep a continuous sequence of all archived WAL files that extends at minimum back to the start time of your last base backup. Therefore the ideal frequency of base backups depends on:

  • The storage volume available for archived WAL files.
  • The maximum possible duration of data recovery in situations when recovery is necessary. In cases with a long period since the last backup, the system replays more WAL segments, and the recovery therefore takes more time.
Note

You cannot use pg_dump and pg_dumpall SQL dumps as a part of a continuous archiving backup solution. SQL dumps produce logical backups and do not contain enough information to be used by a WAL replay.

To perform a database backup and restore using the continuous archiving method, follow these instructions:

  1. Set up and test your procedure for archiving WAL files - see WAL archiving.
  2. Perform a base backup - see base backup.

To restore your data, follow instructions in Restoring database with continuous archiving.

4.6.3.2. Advantages and disadvantages of continuous archiving

Continuous archiving has the following advantages compared to other PostgreSQL backup methods:

  • With the continuous backup method, it is possible to use a base backup that is not entirely consistent because any internal inconsistency in the backup is corrected by the log replay. Therefore you can perform a base backup on a running PostgreSQL server.
  • A file system snapshot is not needed; tar or a similar archiving utility is sufficient.
  • Continuous backup can be achieved by continuing to archive the WAL files because the sequence of WAL files for the log replay can be indefinitely long. This is particularly valuable for large databases.
  • Continuous backup supports point-in-time recovery. It is not necessary to replay the WAL entries to the end. The replay can be stopped at any point and the database can be restored to its state at any time since the base backup was taken.
  • If the series of WAL files are continuously available to another machine that has been loaded with the same base backup file, it is possible to restore the other machine with a nearly-current copy of the database at any point.

Continuous archiving has the following disadvantages compared to other PostgreSQL backup methods:

  • Continuous backup method supports only restoration of an entire database cluster, not a subset.
  • Continuous backup requires extensive archival storage.

4.6.3.3. Setting up WAL archiving

A running PostgreSQL server produces a sequence of write ahead log (WAL) records. The server physically divides this sequence into WAL segment files, which are given numeric names that reflect their position in the WAL sequence. Without WAL archiving, the segment files are reused and renamed to higher segment numbers.

When archiving WAL data, the contents of each segment file are captured and saved at a new location before the segment file is reused. You have multiple options where to save the content, such as an NFS-mounted directory on another machine, a tape drive, or a CD.

Note that WAL records do not include changes to configuration files.

To enable WAL archiving, use the following procedure.

Procédure

  1. In the /var/lib/pgsql/data/postgresql.conf file:

    1. Set the wal_level configuration parameter to replica or higher.
    2. Set the archive_mode parameter to on.
    3. Specify the shell command in the archive_command configuration parameter. You can use the cp command, another command, or a shell script.
  2. Restart the postgresql service to enable the changes:

    # systemctl restart postgresql.service
  3. Test your archive command and ensure it does not overwrite an existing file and that it returns a non-zero exit status if it fails.
  4. To protect your data, ensure that the segment files are archived into a directory that does not have group or world read access.
Note

The archive command is executed only on completed WAL segments. A server that generates little WAL traffic can have a substantial delay between the completion of a transaction and its safe recording in archive storage. To limit how old unarchived data can be, you can:

  • Set the archive_timeout parameter to force the server to switch to a new WAL segment file with a given frequency.
  • Use the pg_switch_wal parameter to force a segment switch to ensure that a transaction is archived immediately after it finishes.

Exemple 4.5. Shell command for archiving WAL segments

This example shows a simple shell command you can set in the archive_command configuration parameter.

The following command copies a completed segment file to the required location:

archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'

where the %p parameter is replaced by the relative path to the file to archive and the %f parameter is replaced by the file name.

This command copies archivable WAL segments to the /mnt/server/archivedir/ directory. After replacing the %p and %f parameters, the executed command looks as follows:

test ! -f /mnt/server/archivedir/00000001000000A900000065 && cp pg_wal/00000001000000A900000065 /mnt/server/archivedir/00000001000000A900000065

A similar command is generated for each new file that is archived.

Ressources supplémentaires

4.6.3.4. Making a base backup

You can create a base backup in several ways. This section describes the simplest way of performing a base backup using the pg_basebackup utility on a running PostgreSQL server.

The base backup process creates a backup history file that is stored into the WAL archive area and is named after the first WAL segment file that you need for the base backup.

The backup history file is a small text file containing the starting and ending times, and WAL segments of the backup. If you used the label string to identify the associated dump file, you can use the backup history file to determine which dump file to restore.

Note

Consider keeping several backup sets to be certain that you can recover your data.

To perform a base backup, use the following procedure.

Conditions préalables

  • You must run the commands as the postgres superuser, a user with database administrator privileges, or another user with at least REPLICATION permissions.
  • You must keep all the WAL segment files generated during and after the base backup.

Procédure

  1. Use the pg_basebackup utility to perform the base backup.

    • To create a base backup as individual files (plain format):

      $ pg_basebackup -D backup_directory -Fp

      Replace backup_directory with your desired backup location.

      If you use tablespaces and perform the base backup on the same host as the server, you must also use the --tablespace-mapping option, otherwise the backup will fail upon an attempt to write the backup to the same location.

    • To create a base backup as a tar archive (tar and compressed format):

      $ pg_basebackup -D backup_directory -Ft -z

      Replace backup_directory with your desired backup location.

      To restore such data, you must manually extract the files in the correct locations.

  2. After the base backup process is complete, safely archive the copy of the database cluster and the WAL segment files used during the backup, which are specified in the backup history file.
  3. Delete WAL segments numerically lower than the WAL segment files used in the base backup because these are older than the base backup and no longer needed for a restore.

To specify which database server pg_basebackup will contact, use the following command-line options:

  • The -h option to define the host.

    The default host is either the local host or a host specified by the PGHOST environment variable.

  • The -p option to define the port.

    The default port is indicated by the PGPORT environment variable or the compiled-in default.

4.6.3.5. Restoring the database using a continuous archive backup

To restore a database using a continuous backup, use the following procedure.

Procédure

  1. Stop the server:

    # systemctl stop postgresql.service
  2. Copy the necessary data to a temporary location.

    Preferably, copy the whole cluster data directory and any tablespaces. Note that this requires enough free space on your system to hold two copies of your existing database.

    If you do not have enough space, save the contents of the cluster’s pg_wal directory, which can contain logs that were not archived before the system went down.

  3. Remove all existing files and subdirectories under the cluster data directory and under the root directories of any tablespaces you are using.
  4. Restore the database files from your base backup.

    Ensure that:

    • The files are restored with the correct ownership (the database system user, not root).
    • The files are restored with the correct permissions.
    • The symbolic links in the pg_tblspc/ subdirectory are restored correctly.
  5. Remove any files present in the pg_wal/ subdirectory.

    These files resulted from the base backup and are therefore obsolete. If you did not archive pg_wal/, recreate it with proper permissions.

  6. Copy any unarchived WAL segment files that you saved in step 2 into pg_wal/.
  7. Create the recovery.conf recovery command file in the cluster data directory and specify the shell command in the restore_command configuration parameter. You can use the cp command, another command, or a shell script. For example:

    restore_command = 'cp /mnt/server/archivedir/%f "%p"'
  8. Start the server:

    # systemctl start postgresql.service

    The server will enter the recovery mode and proceed to read through the archived WAL files that it needs.

    If the recovery is terminated due to an external error, the server can be restarted and it will continue the recovery. When the recovery process is completed, the server renames recovery.conf to recovery.done. This prevents the server from accidental re-entering the recovery mode after it starts normal database operations.

  9. Check the contents of the database to verify that the database has recovered into the required state.

    If the database has not recovered into the required state, return to step 1. If the database has recovered into the required state, allow the users to connect by restoring the client authentication configuration in the pg_hba.conf file.

For more information about restoring using the continuous backup, see PostgreSQL Documentation.

4.6.3.6. Ressources supplémentaires

Red Hat logoGithubRedditYoutubeTwitter

Apprendre

Essayez, achetez et vendez

Communautés

À propos de la documentation Red Hat

Nous aidons les utilisateurs de Red Hat à innover et à atteindre leurs objectifs grâce à nos produits et services avec un contenu auquel ils peuvent faire confiance.

Rendre l’open source plus inclusif

Red Hat s'engage à remplacer le langage problématique dans notre code, notre documentation et nos propriétés Web. Pour plus de détails, consultez leBlog Red Hat.

À propos de Red Hat

Nous proposons des solutions renforcées qui facilitent le travail des entreprises sur plusieurs plates-formes et environnements, du centre de données central à la périphérie du réseau.

© 2024 Red Hat, Inc.