Chapter 3. Using PostgreSQL


The PostgreSQL server is an open source, robust, and highly extensible database server based on the SQL language. It provides an object-relational database system that can manage extensive datasets and a high number of concurrent users. For these reasons, PostgreSQL servers can be used in clusters to manage large amounts of data.

The PostgreSQL server includes features for ensuring data integrity and for building fault-tolerant environments and applications. With the PostgreSQL server, you can extend a database with your own data types, custom functions, or code from different programming languages without the need to recompile the database.

Learn how to install and configure PostgreSQL on a RHEL system, how to back up PostgreSQL data, and how to migrate from an earlier PostgreSQL version.

3.1. Installing PostgreSQL

RHEL 10 provides PostgreSQL 16 as the initial version of the Application Stream, which can be installed easily as an RPM package. Additional PostgreSQL versions are provided as alternative versions with a shorter life cycle in minor releases of RHEL 10.

Important

By design, you can install only one version (stream) of the same module and, because of conflicting RPM packages, you cannot install multiple PostgreSQL instances on the same host. As an alternative, you can run the database server services in a container.

Procedure

  1. Install the PostgreSQL server packages:

    # dnf install postgresql-server

    The postgres superuser is created automatically.

  2. Initialize the database cluster:

    # postgresql-setup --initdb

    Store the data in the default /var/lib/pgsql/data directory.

  3. Enable and start the postgresql service:

    # systemctl enable --now postgresql.service

3.2. Creating PostgreSQL users

PostgreSQL users are of the following types:

  • The postgres Linux system user: Use it only to run the PostgreSQL server and client applications, such as pg_dump. Do not use the postgres system user for any interactive work on PostgreSQL administration, such as database creation and user management.
  • A database superuser: The default postgres PostgreSQL superuser is not related to the postgres system user. You can limit access of the postgres superuser in the /var/lib/pgsql/data/pg_hba.conf file, otherwise no other permission limitations exist. You can also create other database superusers.
  • A role with specific database access permissions:

    • A database user: Has permission to log in by default.
    • A group of users: Enables managing permissions for the group as a whole.

Roles can own database objects (for example, tables and functions) and can assign object privileges to other roles by using SQL commands.

Standard database management privileges include SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER, CREATE, CONNECT, TEMPORARY, EXECUTE, and USAGE.

Role attributes are special privileges, such as LOGIN, SUPERUSER, CREATEDB, and CREATEROLE.
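
As a rough illustration of group roles and object privileges, the following sketch prints SQL that creates a group role, grants table privileges to it, and adds a member. The function name and the role, member, and table names are hypothetical. The statements themselves are standard PostgreSQL SQL, and the output is meant to be reviewed before it is passed to psql.

```shell
# Sketch: print SQL for a group role. All names are illustrative assumptions.
# Identifiers are inserted unquoted, so pass only simple, trusted names.
group_role_sql() {
    group=$1
    member=$2
    table=$3
    cat <<EOF
CREATE ROLE $group NOLOGIN;
GRANT SELECT, INSERT ON TABLE $table TO $group;
GRANT $group TO $member;
EOF
}

# Print the statements for review.
group_role_sql reporting mydbuser sales
```

To apply the statements, pipe the output to psql while connected as a role that is allowed to create roles and that owns the table.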

Important

Perform most tasks as a role that is not a superuser. A common practice is to create a role that has the CREATEDB and CREATEROLE privileges and use this role for all routine management of databases and roles.

Prerequisites

  • The PostgreSQL server is installed.
  • The database cluster is initialized.
  • The password_encryption parameter in the /var/lib/pgsql/data/postgresql.conf file is set to scram-sha-256.
  • Entries in the /var/lib/pgsql/data/pg_hba.conf file use the scram-sha-256 hashing algorithm as the authentication method.

Procedure

  1. Log in as the postgres system user, or switch to this user:

    # su - postgres
  2. Start the PostgreSQL interactive terminal:

    $ psql
    psql (16.4)
    Type "help" for help.
    
    postgres=#
  3. Optional: Obtain information about the current database connection:

    postgres=# \conninfo
    You are connected to database "postgres" as user "postgres" via socket in "/var/run/postgresql" at port "5432".
  4. Create a user named mydbuser, set a password for it, and assign the CREATEROLE and CREATEDB permissions to the user:

    postgres=# CREATE USER mydbuser WITH PASSWORD '<password>' CREATEROLE CREATEDB;
    CREATE ROLE

    The mydbuser user can now perform routine database management operations: create databases and manage roles.

  5. Log out of the interactive terminal by using the \q meta command:

    postgres=# \q

Verification

  1. Log in to the PostgreSQL terminal as mydbuser, specify the hostname, and connect to the default postgres database, which was created during initialization:

    # psql -U mydbuser -h 127.0.0.1 -d postgres
    Password for user mydbuser:
    Type the password.
    psql (16.4)
    Type "help" for help.
    
    postgres=>
  2. Create a database:

    postgres=> CREATE DATABASE <db_name>;
  3. Log out of the session:

    postgres=> \q
  4. Connect to the new database as mydbuser:

    # psql -U mydbuser -h 127.0.0.1 -d <db_name>
    Password for user mydbuser:
    psql (16.4)
    Type "help" for help.
    <db_name>=>

3.3. Configuring PostgreSQL

In a PostgreSQL database, all data and configuration files are stored in a single directory called a database cluster. By default, PostgreSQL uses the /var/lib/pgsql/data/ directory.

PostgreSQL configuration consists of the following files:

  • /var/lib/pgsql/data/postgresql.conf: Sets the database cluster parameters.
  • /var/lib/pgsql/data/postgresql.auto.conf: Holds basic PostgreSQL settings, similarly to postgresql.conf. However, this file is under server control. The ALTER SYSTEM queries edit it, and you must not edit it manually.
  • /var/lib/pgsql/data/pg_ident.conf: Maps user identities from external authentication mechanisms to PostgreSQL user identities.
  • /var/lib/pgsql/data/pg_hba.conf: Configures client authentication for PostgreSQL databases.

Procedure

  1. Edit the respective configuration file.

    Example 3.1. Configuring PostgreSQL database cluster parameters

    Basic settings of the database cluster parameters in the /var/lib/pgsql/data/postgresql.conf file.

    log_connections = yes
    log_destination = 'syslog'
    search_path = '"$user", public'
    shared_buffers = 128MB
    password_encryption = scram-sha-256

    Example 3.2. Setting client authentication in PostgreSQL

    Set client authentication in the /var/lib/pgsql/data/pg_hba.conf file:

    # TYPE    DATABASE       USER        ADDRESS              METHOD
    local     all            all                              trust
    host      postgres       all         192.168.93.0/24      ident
    host      all            all         .example.com         scram-sha-256
  2. Restart the postgresql service so that the changes become effective:

    # systemctl restart postgresql.service

3.4. Configuring TLS encryption on a PostgreSQL server

By default, PostgreSQL uses unencrypted connections. For more secure connections, you can enable Transport Layer Security (TLS) support on the PostgreSQL server and configure your clients to establish encrypted connections.

Prerequisites

  • You created a TLS private key and a certificate authority (CA) issued a server certificate for your PostgreSQL server.
  • The PostgreSQL server is installed.
  • The database cluster is initialized.
  • If FIPS mode is enabled, clients must either support the Extended Master Secret (EMS) extension or use TLS 1.3. TLS 1.2 connections without EMS fail. For more information, see the Red Hat Knowledgebase solution TLS extension "Extended Master Secret" enforced on RHEL 9.2 and later.

Procedure

  1. Store the private key and the server certificate in the /var/lib/pgsql/data/ directory:

    # cp server.{key,crt} /var/lib/pgsql/data/
  2. Set the ownership of the private key and certificate:

    # chown postgres:postgres /var/lib/pgsql/data/server.{key,crt}
  3. Set permissions on the private key so that only the PostgreSQL server can read the file:

    # chmod 0400 /var/lib/pgsql/data/server.key

    Because certificates are part of the communication before a secure connection is established, any client can retrieve them without authentication. Therefore, you do not need to set strict permissions on the server certificate file.

  4. Edit the /var/lib/pgsql/data/postgresql.conf file and make the following changes:

    1. Set the scram-sha-256 hashing algorithm:

      password_encryption = scram-sha-256
    2. Enable TLS encryption:

      ssl = on
  5. Edit the /var/lib/pgsql/data/pg_hba.conf file and update the authentication entries to use TLS encryption and the scram-sha-256 hashing algorithm. For example, change host entries to hostssl to enable TLS encryption, and set the scram-sha-256 hashing algorithm in the last column:

    hostssl    all    all    192.0.2.0/24    scram-sha-256
  6. Restart the postgresql service:

    # systemctl restart postgresql.service

Verification

  • Use the postgres superuser to connect to a PostgreSQL server and execute the \conninfo meta command:

    # psql "postgresql://postgres@localhost:5432" -c '\conninfo'
    Password for user postgres:
    You are connected to database "postgres" as user "postgres" on host "192.0.2.1" at port "5432".
    SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
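
If the service does not start after you enable TLS, one thing to check is whether other users can read the private key, because PostgreSQL rejects a key file with group or world access. As a sketch, the following function checks the mode that step 3 of the procedure sets. It assumes GNU stat, as shipped with RHEL, and a key owned by the postgres user. The function name and the demonstration file are hypothetical.

```shell
# Sketch: report whether a key file has an owner-only mode (0400 or 0600).
# Assumes GNU stat; check_key_mode is an illustrative name.
check_key_mode() {
    key_file=$1
    mode=$(stat -c '%a' "$key_file") || return 2
    case $mode in
        400|600)
            echo "ok: mode $mode" ;;
        *)
            echo "too permissive: mode $mode"
            return 1 ;;
    esac
}

# Demonstration on a throwaway file instead of the real server.key.
demo_key=$(mktemp)
chmod 0400 "$demo_key"
check_key_mode "$demo_key"
```

On a real system, run check_key_mode /var/lib/pgsql/data/server.key as root or as the postgres user.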

3.5. Backing up PostgreSQL data with an SQL dump

The SQL dump method is based on generating a dump file with SQL commands. When a dump is uploaded back to the database server, it recreates the database in the same state as it was at the time of the dump.

The following PostgreSQL client applications create SQL dumps:

  • pg_dump dumps a single database without cluster-wide information about roles or tablespaces.
  • pg_dumpall dumps each database in a given cluster and preserves cluster-wide data, such as role and tablespace definitions.

By default, the pg_dump and pg_dumpall commands write their results into the standard output. To store the dump in a file, redirect the output to an SQL file. The resulting SQL file can be either in a text format or in other formats that allow for parallelism and for more detailed control of object restoration.

You can perform the SQL dump from any remote host that has access to the database.
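
The non-text formats mentioned above can be sketched as follows. The -Fc, -f, -d, and -j options are standard pg_dump and pg_restore options. The function names, the PG_DUMP and PG_RESTORE override variables, and all database and file names are assumptions made for this illustration.

```shell
# Sketch: dump in the custom format and restore with parallel jobs.
# PG_DUMP and PG_RESTORE are hypothetical overrides for the real binaries.

dump_custom() {
    db_name=$1
    dump_file=$2
    # -Fc selects the compressed custom format; -f names the output file.
    "${PG_DUMP:-pg_dump}" -Fc -f "$dump_file" "$db_name"
}

restore_custom() {
    db_name=$1
    dump_file=$2
    jobs=$3
    # -d names the existing target database; -j sets the number of parallel jobs.
    "${PG_RESTORE:-pg_restore}" -d "$db_name" -j "$jobs" "$dump_file"
}
```

For example, dump_custom mydb mydb.dump followed by restore_custom mydb_copy mydb.dump 4 would restore the dump into an existing mydb_copy database with four parallel jobs.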

3.5.1. Advantages and disadvantages of an SQL dump

An SQL dump has the following advantages compared to other PostgreSQL backup methods:

  • An SQL dump is the only PostgreSQL backup method that is not server version-specific. The output of the pg_dump utility can be reloaded into later versions of PostgreSQL, which is not possible for file system level backups or continuous archiving.
  • An SQL dump is the only method that works when transferring a database to a different machine architecture, such as going from a 32-bit to a 64-bit server.
  • An SQL dump provides internally consistent dumps. A dump represents a snapshot of the database at the time pg_dump began running.
  • The pg_dump utility does not block other operations on the database when it is running.

A disadvantage of an SQL dump is that it takes more time compared to file system level backup.

3.5.2. Performing an SQL dump by using pg_dump

To dump a single database without cluster-wide information, use the pg_dump utility.

Prerequisites

  • You must have read access to all tables that you want to dump. To dump the entire database, you must run the commands as the postgres superuser or a user with database administrator privileges.

Procedure

  • Dump a database without cluster-wide information:

    $ pg_dump <db_name> > <dump_file>

    To specify which database server pg_dump will contact, use the following command-line options:

    • The -h option to define the host.

      The default host is either the local host or what is specified by the PGHOST environment variable.

    • The -p option to define the port.

      The default port is indicated by the PGPORT environment variable or the compiled-in default.

3.5.3. Performing an SQL dump by using pg_dumpall

To dump each database in a given database cluster and to preserve cluster-wide data, use the pg_dumpall utility.

Prerequisites

  • You must run the commands as the postgres superuser or a user with database administrator privileges.

Procedure

  • Dump all databases in the database cluster and preserve cluster-wide data:

    $ pg_dumpall > <dump_file>

    To specify which database server pg_dumpall will contact, use the following command-line options:

    • The -h option to define the host.

      The default host is either the local host or what is specified by the PGHOST environment variable.

    • The -p option to define the port.

      The default port is indicated by the PGPORT environment variable or the compiled-in default.

    • The -l option to define the default database.

      This option enables you to choose a default database different from the postgres database created automatically during initialization.

3.5.4. Restoring a database dumped by using pg_dump

To restore a database from an SQL dump that you dumped using the pg_dump utility, follow the steps below.

Prerequisites

  • You must run the commands as the postgres superuser or a user with database administrator privileges.

Procedure

  1. Create a new database:

    $ createdb <db_name>
  2. Verify that all users who own objects or were granted permissions on objects in the dumped database already exist. If such users do not exist, the restore fails to recreate the objects with the original ownership and permissions.
  3. Run the psql utility to restore a text file dump created by the pg_dump utility:

    $ psql <db_name> < <dump_file>

    where <dump_file> is the output of the pg_dump command. To restore a non-text file dump, use the pg_restore utility instead:

    $ pg_restore -d <db_name> <non-plain_text_file>
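
One way to satisfy step 2 is to carry the role definitions over from the source cluster before restoring. The sketch below relies on the --globals-only option of pg_dumpall, which dumps only roles and tablespaces. The function name, the PG_DUMPALL override variable, and the file name are assumptions.

```shell
# Sketch: save cluster-wide role and tablespace definitions to a file.
# PG_DUMPALL is a hypothetical override for the real pg_dumpall binary.
dump_globals() {
    globals_file=$1
    # --globals-only writes role and tablespace definitions, no database contents.
    "${PG_DUMPALL:-pg_dumpall}" --globals-only > "$globals_file"
}
```

On the source server, run dump_globals globals.sql as the postgres superuser. On the target server, load the file with psql -f globals.sql postgres before you restore the database dump.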

3.5.5. Restoring databases dumped by using pg_dumpall

To restore data from a database cluster that you dumped by using the pg_dumpall utility, follow the steps below.

Prerequisites

  • You must run the commands as the postgres superuser or a user with database administrator privileges.

Procedure

  1. Ensure that all users who own objects or were granted permissions on objects in the dumped databases already exist. If such users do not exist, the restore fails to recreate the objects with the original ownership and permissions.
  2. Run the psql utility to restore a text file dump created by the pg_dumpall utility:

    $ psql < <dump_file>

    where <dump_file> is the output of the pg_dumpall command.

3.5.6. Performing an SQL dump of a database on another server

Dumping a database directly from one server to another is possible because pg_dump and psql can write to and read from pipes.

Procedure

  • To dump a database from one server to another, run:

    $ pg_dump -h <host_1> <db_name> | psql -h <host_2> <db_name>

3.5.7. Handling SQL errors during restore

By default, psql continues to execute if an SQL error occurs, causing the database to restore only partially.

To change the default behavior, use one of the following approaches when restoring a dump.

Prerequisites

  • You must run the commands as the postgres superuser or a user with database administrator privileges.

Procedure

  • Make psql exit with an exit status of 3 if an SQL error occurs by setting the ON_ERROR_STOP variable:

    $ psql --set ON_ERROR_STOP=on <db_name> < <dump_file>
  • Specify that the whole dump is restored as a single transaction so that the restore is either fully completed or canceled.

    • When restoring a text file dump by using the psql utility:

      $ psql -1 -f <dump_file> <db_name>
    • When restoring a non-text file dump by using the pg_restore utility:

      $ pg_restore -1 -d <db_name> <non-plain_text_file>

    Note that when you use this approach, even a minor error can cancel a restore operation that has already run for many hours.
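
The two approaches can also be combined. The following sketch wraps psql so that the restore runs in a single transaction, stops at the first SQL error, and reports the exit status 3 described above. The restore_dump function and the PSQL override variable are hypothetical names used only for illustration.

```shell
# Sketch: restore a text dump atomically and interpret the psql exit status.
# PSQL is a hypothetical override for the real psql binary.
restore_dump() {
    db_name=$1
    dump_file=$2
    status=0
    # -1 wraps the file given with -f in one transaction;
    # ON_ERROR_STOP makes psql stop at the first SQL error.
    "${PSQL:-psql}" --set ON_ERROR_STOP=on -1 -f "$dump_file" "$db_name" || status=$?
    case $status in
        0) echo "restore completed" ;;
        3) echo "SQL error: restore rolled back" ;;
        *) echo "psql failed with exit status $status" ;;
    esac
    return "$status"
}
```

A call such as restore_dump mydb dump.sql returns the psql exit status, so a calling script can decide whether to retry or abort.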

3.6. Backing up PostgreSQL data with a file system level backup

To create a file system level backup, copy PostgreSQL database files to another location. For example, you can use any of the following approaches:

  • Create an archive file by using the tar utility.
  • Copy the files to a different location by using the rsync utility.
  • Create a consistent snapshot of the data directory.

3.6.1. Advantages and limitations of a file system level backup

A file system level backup has the following advantage compared to other PostgreSQL backup methods:

  • A file system level backup is usually faster than an SQL dump.

A file system level backup has the following limitations compared to other PostgreSQL backup methods:

  • This backup method is not suitable when you want to upgrade from RHEL 9 to RHEL 10 and migrate your data to the upgraded system. A file system level backup is specific to an architecture and a RHEL major version. You can restore your data on your RHEL 9 system if the upgrade is not successful, but you cannot restore the data on a RHEL 10 system.
  • The database server must be shut down before backing up and restoring data.
  • Backing up and restoring individual files or tables is impossible. A file system level backup works only for backing up and restoring an entire database cluster.

3.6.2. Performing a file system level backup

To perform a file system level backup, use the following procedure.

Procedure

  1. Stop the postgresql service:

    # systemctl stop postgresql.service
  2. Use any method to create a file system backup, for example a tar archive:

    # tar -cf backup.tar /var/lib/pgsql/data/
  3. Start the postgresql service:

    # systemctl start postgresql.service
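
Step 2 can be sketched as a small function that creates the archive and then lists it to confirm that it is readable. The function name and the paths in the demonstration are assumptions. The demonstration runs against a scratch directory rather than a real data directory.

```shell
# Sketch: archive a data directory and verify that the archive can be listed.
backup_data_dir() {
    data_dir=$1
    archive=$2
    # -C changes to the parent directory so that archive members get relative names.
    tar -C "$(dirname "$data_dir")" -cf "$archive" "$(basename "$data_dir")" &&
        tar -tf "$archive" > /dev/null
}

# Demonstration on a throwaway directory tree.
scratch=$(mktemp -d)
mkdir "$scratch/data"
echo 16 > "$scratch/data/PG_VERSION"
if backup_data_dir "$scratch/data" "$scratch/backup.tar"; then
    echo "backup archive verified"
fi
```

On a real system, call the function as root with /var/lib/pgsql/data as the first argument while the postgresql service is stopped.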

3.7. Backing up PostgreSQL data by continuous archiving

PostgreSQL records every change made to the database’s data files in a write ahead log (WAL) file that is available in the pg_wal/ subdirectory of the cluster’s data directory. This log is intended primarily for crash recovery. After a crash, the log entries made since the last checkpoint can be used to restore the database to a consistent state.

The continuous archiving method, also known as an online backup, combines the WAL files with a copy of the database cluster in the form of a base backup performed on a running server or a file system level backup.

If a database recovery is needed, you can restore the database from the copy of the database cluster and then replay the log from the backed up WAL files to bring the system to the current state.

With the continuous archiving method, you must keep a continuous sequence of all archived WAL files that extends at minimum back to the start time of your last base backup. Therefore, the ideal frequency of base backups depends on:

  • The storage volume available for archived WAL files.
  • The maximum possible duration of data recovery in situations when recovery is necessary. In cases with a long period since the last backup, the system replays more WAL segments, and the recovery therefore takes more time.

Note

You cannot use pg_dump and pg_dumpall SQL dumps as a part of a continuous archiving backup solution. SQL dumps produce logical backups and do not contain enough information to be used by a WAL replay.

3.7.1. Advantages and disadvantages of continuous archiving

Continuous archiving has the following advantages compared to other PostgreSQL backup methods:

  • With the continuous backup method, it is possible to use a base backup that is not entirely consistent because any internal inconsistency in the backup is corrected by the log replay. Therefore you can perform a base backup on a running PostgreSQL server.
  • A file system snapshot is not needed; tar or a similar archiving utility is sufficient.
  • Continuous backup can be achieved by continuing to archive the WAL files because the sequence of WAL files for the log replay can be indefinitely long. This is particularly valuable for large databases.
  • Continuous backup supports point-in-time recovery. It is not necessary to replay the WAL entries to the end. The replay can be stopped at any point and the database can be restored to its state at any time since the base backup was taken.
  • If the series of WAL files is continuously available to another machine that has been loaded with the same base backup file, it is possible to restore the other machine with a nearly-current copy of the database at any point.

Continuous archiving has the following disadvantages compared to other PostgreSQL backup methods:

  • The continuous backup method supports only restoration of an entire database cluster, not a subset.
  • Continuous backup requires extensive archival storage.

3.7.2. Setting up WAL archiving

A running PostgreSQL server produces a sequence of write ahead log (WAL) records. The server physically divides this sequence into WAL segment files, which are given numeric names that reflect their position in the WAL sequence. Without WAL archiving, the segment files are reused and renamed to higher segment numbers.

When archiving WAL data, the contents of each segment file are captured and saved at a new location before the segment file is reused. You have multiple options for where to save the content, such as an NFS-mounted directory on another machine, a tape drive, or a CD.

Note that WAL records do not include changes to configuration files.

To enable WAL archiving, use the following procedure.

Procedure

  1. In the /var/lib/pgsql/data/postgresql.conf file:

    1. Set the wal_level configuration parameter to replica or higher.
    2. Set the archive_mode parameter to on.
    3. Specify the shell command in the archive_command configuration parameter. You can use the cp command, another command, or a shell script.

      Note

      The archive command is executed only on completed WAL segments. A server that generates little WAL traffic can have a substantial delay between the completion of a transaction and its safe recording in archive storage. To limit how old unarchived data can be, you can:

      • Set the archive_timeout parameter to force the server to switch to a new WAL segment file with a given frequency.
      • Call the pg_switch_wal() function to force a segment switch to ensure that a transaction is archived immediately after it finishes.

      Example 3.3. Shell command for archiving WAL segments

      This example shows a simple shell command you can set in the archive_command configuration parameter.

      The following command copies a completed segment file to the required location:

      archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'

      where the %p parameter is replaced by the relative path to the file to archive and the %f parameter is replaced by the file name.

      This command copies archivable WAL segments to the /mnt/server/archivedir/ directory. After replacing the %p and %f parameters, the executed command looks as follows:

      test ! -f /mnt/server/archivedir/00000001000000A900000065 && cp pg_wal/00000001000000A900000065 /mnt/server/archivedir/00000001000000A900000065

      A similar command is generated for each new file that is archived.

  2. Restart the postgresql service to enable the changes:

    # systemctl restart postgresql.service
  3. Test your archive command and ensure it does not overwrite an existing file and that it returns a nonzero exit status if it fails.
  4. To protect your data, ensure that the segment files are archived into a directory that does not have group or world read access.
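
Step 3 can be tried out without touching the real archive. The sketch below runs the same test-and-copy logic as Example 3.3 against scratch directories and shows that a second attempt to archive the same segment fails instead of overwriting the file. The archive_segment function and the scratch paths are assumptions made for this illustration.

```shell
# Same logic as the archive_command in Example 3.3; the arguments stand in
# for %p, %f, and the archive directory.
archive_segment() {
    src_path=$1
    file_name=$2
    archive_dir=$3
    test ! -f "$archive_dir/$file_name" && cp "$src_path" "$archive_dir/$file_name"
}

# Scratch layout that imitates pg_wal/ and the archive directory.
work_dir=$(mktemp -d)
mkdir "$work_dir/pg_wal" "$work_dir/archivedir"
segment=00000001000000A900000065
echo "segment data" > "$work_dir/pg_wal/$segment"

# First attempt: the segment is not archived yet, so the copy succeeds.
if archive_segment "$work_dir/pg_wal/$segment" "$segment" "$work_dir/archivedir"; then
    echo "first attempt: archived"
fi

# Second attempt: the file exists, so the command returns a nonzero status.
if ! archive_segment "$work_dir/pg_wal/$segment" "$segment" "$work_dir/archivedir"; then
    echo "second attempt: refused to overwrite"
fi
```

A nonzero exit status tells the server that the segment was not archived, so the server keeps the segment and retries later.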

3.7.3. Making a base backup

You can create a base backup in several ways. The simplest way of performing a base backup is using the pg_basebackup utility on a running PostgreSQL server.

The base backup process creates a backup history file that is stored into the WAL archive area and is named after the first WAL segment file that you need for the base backup.

The backup history file is a small text file containing the starting and ending times, and WAL segments of the backup. If you used the label string to identify the associated dump file, you can use the backup history file to determine which dump file to restore.

Note

Consider keeping several backup sets to be certain that you can recover your data.

Prerequisites

  • You must run the commands as the postgres superuser, a user with database administrator privileges, or another user with at least REPLICATION permissions.
  • You must keep all the WAL segment files generated during and after the base backup.

Procedure

  1. Use the pg_basebackup utility to perform the base backup.

    • To create a base backup as individual files (plain format):

      $ pg_basebackup -D <backup_directory> -Fp

      Replace <backup_directory> with your chosen backup location.

      If you use tablespaces and perform the base backup on the same host as the server, you must also use the --tablespace-mapping option, otherwise the backup will fail upon an attempt to write the backup to the same location.

    • To create a base backup as a tar archive (tar and compressed format):

      $ pg_basebackup -D <backup_directory> -Ft -z

      Replace <backup_directory> with your chosen backup location.

      To restore such data, you must manually extract the files in the correct locations.

    To specify which database server pg_basebackup will contact, use the following command-line options:

    • The -h option to define the host.

      The default host is either the local host or a host specified by the PGHOST environment variable.

    • The -p option to define the port.

      The default port is indicated by the PGPORT environment variable or the compiled-in default.

  2. After the base backup process is complete, safely archive the copy of the database cluster and the WAL segment files used during the backup, which are specified in the backup history file.
  3. Delete WAL segments numerically lower than the WAL segment files used in the base backup because these are older than the base backup and no longer needed for a restore.

3.7.4. Restoring the database by using a continuous archive backup

To restore a database by using a continuous backup, use the following procedure.

Procedure

  1. Stop the server:

    # systemctl stop postgresql.service
  2. Copy the necessary data to a temporary location.

    Preferably, copy the whole cluster data directory and any tablespaces. Note that this requires enough free space on your system to hold two copies of your existing database.

    If you do not have enough space, save the contents of the cluster’s pg_wal directory, which can contain logs that were not archived before the system went down.

  3. Remove all existing files and subdirectories under the cluster data directory and under the root directories of any tablespaces you are using.
  4. Restore the database files from your base backup.

    Ensure that:

    • The files are restored with the correct ownership (the database system user, not root).
    • The files are restored with the correct permissions.
    • The symbolic links in the pg_tblspc/ subdirectory are restored correctly.
  5. Remove any files present in the pg_wal/ subdirectory.

    These files resulted from the base backup and are therefore obsolete. If you did not archive pg_wal/, recreate it with proper permissions.

  6. Copy any unarchived WAL segment files that you saved in step 2 into pg_wal/.
  7. Set the restore_command configuration parameter in the /var/lib/pgsql/data/postgresql.conf file to the shell command that retrieves archived WAL segments, and create an empty recovery.signal file in the cluster data directory. Starting with PostgreSQL 12, the server no longer reads a recovery.conf file and refuses to start if one exists. You can use the cp command, another command, or a shell script. For example:

    restore_command = 'cp /mnt/server/archivedir/%f "%p"'
  8. Start the server:

    # systemctl start postgresql.service

    The server enters recovery mode and reads through the archived WAL files that it needs.

    If the recovery is terminated due to an external error, you can restart the server and it continues the recovery. When the recovery process is complete, the server removes the recovery.signal file. This prevents the server from accidentally re-entering recovery mode after it starts normal database operations.

  9. Check the contents of the database to verify that the database has recovered into the required state.

    If the database has not recovered into the required state, return to step 1. If the database has recovered into the required state, allow the users to connect by restoring the client authentication configuration in the pg_hba.conf file.
