Chapter 10. Data privacy for Projects
OpenStack is designed to support multi-tenancy between projects with different data requirements. A cloud operator will need to consider their applicable data privacy concerns and regulations. This chapter addresses aspects of data residency and disposal for OpenStack deployments.
10.1. Data privacy concerns
This section describes some concerns that can arise for data privacy in OpenStack deployments.
10.1.1. Data residency
The privacy and isolation of data has consistently been cited as the primary barrier to cloud adoption over the past few years. Concerns over who owns data in the cloud and whether the cloud operator can be ultimately trusted as a custodian of this data have been significant issues in the past.
Certain OpenStack services have access to data and metadata belonging to projects or reference project information. For example, project data stored in an OpenStack cloud might include the following items:
- Object Storage objects.
- Compute instance ephemeral filesystem storage.
- Compute instance memory.
- Block Storage volume data.
- Public keys for Compute access.
- Virtual machine images in the Image service.
- Instance snapshots.
- Data passed to Compute’s configuration-drive extension.
Metadata stored by an OpenStack cloud includes the following items (this list is non-exhaustive):
- Organization name.
- User’s “Real Name”.
- Number or size of running instances, buckets, objects, volumes, and other quota-related items.
- Number of hours running instances or storing data.
- IP addresses of users.
- Internally generated private keys for compute image bundling.
10.1.2. Data disposal
Good practices suggest that the operator must sanitize cloud system media (digital and non-digital) prior to disposal, prior to release out of organization control, or prior to release for reuse. Sanitization methods should implement an appropriate level of strength and integrity given the specific security domain and sensitivity of the information.
The NIST Special Publication 800-53 Revision 4 takes a particular view on this topic:
The sanitization process removes information from the media such that the information cannot be retrieved or reconstructed. Sanitization techniques, including clearing, purging, cryptographic erase, and destruction, prevent the disclosure of information to unauthorized individuals when such media is reused or released for disposal.
Cloud operators should consider the following when developing general data disposal and sanitization guidelines (as per the NIST recommended security controls):
- Track, document and verify media sanitization and disposal actions.
- Test sanitation equipment and procedures to verify proper performance.
- Sanitize portable, removable storage devices prior to connecting such devices to the cloud infrastructure.
- Destroy cloud system media that cannot be sanitized.
As a result, an OpenStack deployment will need to address the following practices (among others):
- Secure data erasure
- Instance memory scrubbing
- Block Storage volume data
- Compute instance ephemeral storage
- Bare metal server sanitization
10.1.3. Data not securely erased
Within OpenStack some data might be deleted, but not securely erased in the context of the NIST standards outlined above. This is generally applicable to most or all of the above-defined metadata and information stored in the database. This might be remediated with database and/or system configuration for auto vacuuming and periodic free-space wiping.
10.1.4. Instance memory scrubbing
Specific to various hypervisors is the treatment of instance memory. This behavior is not defined in Compute, although it is generally expected of hypervisors that they will make a best effort to scrub memory either upon deletion of an instance, upon creation of an instance, or both.
10.1.5. Encrypting cinder volume data
Use of the OpenStack volume encryption feature is highly encouraged. This is discussed below in the Data Encryption section under Volume Encryption. When this feature is used, destruction of data is accomplished by securely deleting the encryption key. The end user can select this feature while creating a volume, but note that an admin must perform a one-time set up of the volume encryption feature first.
If the OpenStack volume encryption feature is not used, then other approaches generally would be more difficult to enable. If a back-end plug-in is being used, there might be independent ways of doing encryption or non-standard overwrite solutions. Plug-ins to OpenStack Block Storage will store data in a variety of ways. Many plug-ins are specific to a vendor or technology, whereas others are more DIY solutions around filesystems (such as LVM or ZFS). Methods for securely destroying data will vary between plug-ins, vendors, and filesystems.
Some back ends (such as ZFS) will support copy-on-write to prevent data exposure. In these cases, reads from unwritten blocks will always return zero. Other back ends (such as LVM) might not natively support this, so the cinder plug-in takes the responsibility to override previously written blocks before handing them to users. It is important to review what assurances your chosen volume back-end provides and to see what remediation might be available for those assurances not provided.
10.1.6. Image service delay delete features
Image Service has a delayed delete feature, which will pend the deletion of an image for a defined time period. Consider disabling this feature if this behavior is a security concern; you can do this by editing glance-api.conf
file and setting the delayed_delete
option to False
.
10.1.7. Compute soft delete features
Compute has a soft-delete feature, which enables an instance that is deleted to be in a soft-delete state for a defined time period. The instance can be restored during this time period. To disable the soft-delete feature, edit the /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf
file and leave the reclaim_instance_interval
option empty.
10.1.8. Ephemeral storage for Compute instances
Note that the OpenStack Ephemeral disk encryption feature provides a means of improving ephemeral storage privacy and isolation, during both active use as well as when the data is to be destroyed. As in the case of encrypted block storage, one can simply delete the encryption key to effectively destroy the data.
Alternate measures to provide data privacy, in the creation and destruction of ephemeral storage, will be somewhat dependent on the chosen hypervisor and the Compute plug-in.
The libvirt driver for compute might maintain ephemeral storage directly on a filesystem, or in LVM. Filesystem storage generally will not overwrite data when it is removed, although there is a guarantee that dirty extents are not provisioned to users.
When using LVM backed ephemeral storage, which is block-based, it is necessary that the Compute software securely erases blocks to prevent information disclosure. There have previously been information disclosure vulnerabilities related to improperly erased ephemeral block storage devices.
Filesystem storage is a more secure solution for ephemeral block storage devices than LVM as dirty extents cannot be provisioned to users. However, it is important to be mindful that user data is not destroyed, so it is suggested to encrypt the backing filesystem.
10.2. Security hardening for bare metal provisioning
For your bare metal provisioning infrastructure, you should consider security hardening the baseboard management controllers (BMC) in general, and IPMI in particular. For example, you might isolate these systems within a provisioning network, configure non-default and strong passwords, and disable unwanted management functions. For more information, you can refer to the vendor’s guidance on security hardening these components.
If possible, consider evaluating Redfish-based BMCs over legacy ones.
10.2.1. Hardware identification
When deploying a server, there might not always have a reliable way to distinguish it from an attacker’s server. This capability might be dependent on the hardware/BMC to some extent, but generally it seems that there is no verifiable means of identification built into servers.
10.3. Data encryption
The option exists for implementers to encrypt project data wherever it is stored on disk or transported over a network, such as the OpenStack volume encryption feature described below. This is above and beyond the general recommendation that users encrypt their own data before sending it to their provider.
The importance of encrypting data on behalf of projects is largely related to the risk assumed by a provider that an attacker could access project data. There might be requirements here in government, as well as requirements per-policy, in private contract, or even in case law in regard to private contracts for public cloud providers. Consider getting a risk assessment and legal advice before choosing project encryption policies.
Per-instance or per-object encryption is preferable over, in descending order, per-project, per-host, and per-cloud aggregations. This recommendation is inverse to the complexity and difficulty of implementation. Presently, in some projects it is difficult or impossible to implement encryption as loosely granular as even per-project. Implementers should give serious consideration to encrypting project data.
Often, data encryption relates positively to the ability to reliably destroy project and per-instance data, simply by throwing away the keys. It should be noted that in doing so, it becomes of great importance to destroy those keys in a reliable and secure manner.
Opportunities to encrypt data for users are present:
- Object Storage objects
- Network data
10.3.1. Volume encryption
A volume encryption feature in OpenStack supports privacy on a per-project basis. The following features are supported:
- Creation and usage of encrypted volume types, initiated through the dashboard or a command line interface
- Enable encryption and select parameters such as encryption algorithm and key size
- Volume data contained within iSCSI packets is encrypted
- Supports encrypted backups if the original volume is encrypted
- Dashboard indication of volume encryption status. Includes indication that a volume is encrypted, and includes the encryption parameters such as algorithm and key size
- Interface with the Key management service
10.3.2. Ephemeral disk encryption
An ephemeral disk encryption feature addresses data privacy. The ephemeral disk is a temporary work space used by the virtual host operating system. Without encryption, sensitive user information could be accessed on this disk, and vestigial information could remain after the disk is unmounted. The following ephemeral disk encryption features are supported:
10.3.2.1. Creation and usage of encrypted LVM ephemeral disks
The Compute service currently only supports encrypting ephemeral disks in the LVM format.
The Compute configuration file (/var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf
) has the following default parameters within the ephemeral_storage_encryption
section:
-
cipher = aes-xts-plain64
- This field sets the cipher and mode used to encrypt ephemeral storage. AES-XTS is recommended by NIST specifically for disk storage, and the name is shorthand for AES encryption using the XTS encryption mode. Available ciphers depend on kernel support. At the command line, typecryptsetup benchmark
to determine the available options (and see benchmark results), or go to /proc/crypto. -
enabled = false
- To use ephemeral disk encryption, set option:enabled = true
-
key_size = 512
- Note that there might be a key size limitation from the back end key manager that could require the use ofkey_size = 256
, which would only provide an AES key size of 128-bits. XTS requires its own “tweak key” in addition to the encryption key AES requires. This is typically expressed as a single large key. In this case, using the 512-bit setting, 256 bits will be used by AES and 256 bits by XTS. (see NIST)
10.3.2.2. Interacting with the key management service
The key management service will support data isolation by providing ephemeral disk encryption keys on a per-project basis. Ephemeral disk encryption is supported by back end key storage for enhanced security. With the Key management service, when an ephemeral disk is no longer needed, simply deleting the key may take the place of overwriting the ephemeral disk storage area.
10.3.3. Object Storage objects
Object Storage (swift) supports the optional encryption of object data at rest on storage nodes. The encryption of object data is intended to mitigate the risk of user’s` data being read if an unauthorized party were to gain physical access to a disk.
Encryption of data at rest is implemented by middleware that may be included in the proxy server WSGI pipeline. The feature is internal to a swift cluster and not exposed through the API. Clients are unaware that data is encrypted by this feature internally to the swift service; internally encrypted data should never be returned to clients through the swift API.
The following data are encrypted while at rest in swift:
-
Object content, for example, the content of an object
PUT
request’s body. -
The entity tag (
ETag
) of objects that have non-zero content. -
All custom user object metadata values. For example, metadata sent using
X-Object-Meta-
prefixed headers withPUT
orPOST
requests.
Any data or metadata not included in the list above is not encrypted, including:
- Account, container, and object names
- Account and container custom user metadata values
- All custom user metadata names
- Object Content-Type values
- Object size
- System metadata
10.3.4. Block Storage performance and back ends
When enabling the operating system, OpenStack Volume Encryption performance can be enhanced by using the hardware acceleration features currently available in both Intel and AMD processors. Both the OpenStack Volume Encryption feature and the OpenStack Ephemeral Disk Encryption feature use dm-crypt
to secure volume data. dm-crypt
is a transparent disk encryption capability in Linux kernel versions 2.6 and later. When using hardware acceleration, the performance impact of both of the encryption features is minimized.
10.3.5. Network data
Project data for Compute nodes could be encrypted over IPsec or other tunnels. This practice is not common or standard in OpenStack, but is an option available to motivated and interested implementers. Likewise, encrypted data remains encrypted as it is transferred over the network.
10.4. Key management
To address the often mentioned concern of project data privacy, there is significant interest within the OpenStack community to make data encryption more ubiquitous. It is relatively easy for an end-user to encrypt their data prior to saving it to the cloud, and this is a viable path for project objects such as media files, database archives among others. In some instances, client-side encryption is used to encrypt data held by the virtualization technologies which requires client interaction, such as presenting keys, to decrypt data for future use.
Barbican can help projects more seamlessly encrypt the data and have it accessible without burdening the user with key management. Providing encryption and key management services as part of OpenStack eases data-at-rest security adoption and can help address customer concerns about privacy or misuse of data.
The volume encryption and ephemeral disk encryption features rely on a key management service (for example, barbican) for the creation and security-hardened storage of keys.