Chapter 1. Streams for Apache Kafka Proxy overview
Streams for Apache Kafka Proxy is an Apache Kafka protocol-aware proxy designed to enhance Kafka-based systems. Through its filter mechanism, it allows additional behavior to be introduced into a Kafka-based system without requiring changes to either your applications or the Kafka cluster itself.
Functioning as an intermediary, the Streams for Apache Kafka Proxy mediates communication between a Kafka cluster and its clients. It takes on the responsibility of receiving, filtering, and forwarding messages.
1.1. Record Encryption filter
Streams for Apache Kafka Proxy’s Record Encryption filter enhances the security of Kafka messages. The filter uses industry-standard cryptographic techniques to encrypt Kafka messages, ensuring the confidentiality of data stored in the Kafka cluster. Streams for Apache Kafka Proxy centralizes topic-level encryption, so encryption is applied consistently across Kafka clusters.
The filter uses envelope encryption to encrypt the records with symmetric encryption keys.
- Envelope encryption
- Envelope encryption is an industry-standard technique suited for encrypting large volumes of data in an efficient manner. Data is encrypted with a Data Encryption Key (DEK). The DEK is encrypted using a Key Encryption Key (KEK). The KEK is stored securely in a Key Management System (KMS).
- Symmetric encryption keys
- AES (GCM) symmetric encryption keys with a 256-bit key size are used to encrypt and decrypt record data.
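The following is a minimal sketch of envelope encryption with AES-256-GCM, using only the standard Java Cryptography Architecture. The class and key names are illustrative, and the KEK is generated locally purely for demonstration; with the filter, the KEK is held by the KMS and never leaves it.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

public class EnvelopeEncryptionSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    public static void main(String[] args) throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        SecretKey kek = generator.generateKey(); // held by the KMS in practice
        SecretKey dek = generator.generateKey(); // generated by the KMS per KEK

        byte[] recordValue = "confidential payload".getBytes(StandardCharsets.UTF_8);

        // 1. Encrypt the data with the DEK.
        byte[] encryptedRecord = encrypt(dek, recordValue);

        // 2. Wrap the DEK with the KEK. Only the wrapped (encrypted) DEK is
        //    stored alongside the ciphertext; the plaintext DEK is discarded.
        byte[] encryptedDek = encrypt(kek, dek.getEncoded());

        System.out.printf("record: %d bytes, ciphertext: %d bytes, encrypted DEK: %d bytes%n",
                recordValue.length, encryptedRecord.length, encryptedDek.length);
    }

    // AES-GCM encryption with a fresh 96-bit IV prepended to the ciphertext.
    static byte[] encrypt(SecretKey key, byte[] plaintext) throws Exception {
        byte[] iv = new byte[12];
        RANDOM.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ciphertext = cipher.doFinal(plaintext);
        byte[] out = new byte[iv.length + ciphertext.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ciphertext, 0, out, iv.length, ciphertext.length);
        return out;
    }
}
```

Because only the encrypted DEK travels with the data, compromise of stored records alone does not expose plaintext; an attacker would also need the KEK, which remains in the KMS.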
The process is as follows:
- The filter intercepts produce requests from producing applications and encrypts the records.
- The produce request is forwarded to the broker.
- The filter intercepts fetch responses from consuming applications and decrypts the records.
- The fetch response is forwarded to the consuming application.
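Conceptually, the filter's role in this flow resembles the following sketch. The types are hypothetical and chosen for illustration; the proxy's actual filter API operates asynchronously on Kafka protocol frames.

```java
// Hypothetical types for illustration; not the proxy's real filter API.
public class InterceptSketch {

    /** Encrypts and decrypts record values (see the envelope encryption sketch). */
    interface RecordCodec {
        byte[] encrypt(byte[] value);
        byte[] decrypt(byte[] value);
    }

    private final RecordCodec codec;

    InterceptSketch(RecordCodec codec) {
        this.codec = codec;
    }

    /** Produce path: record values are encrypted before forwarding to the broker. */
    byte[] onProduceValue(byte[] value) {
        return codec.encrypt(value);
    }

    /** Fetch path: record values are decrypted before forwarding to the client. */
    byte[] onFetchValue(byte[] value) {
        return codec.decrypt(value);
    }
}
```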
The filter encrypts the record value only. Record keys, headers, and timestamps are not encrypted.
The entire process is transparent from the point of view of Kafka clients and Kafka brokers. Neither is aware that the records are being encrypted, and neither has access to the encryption keys or any influence on the ciphering process used to secure the records.
The filter integrates with a Key Management Service (KMS), which has ultimate responsibility for the safe storage of key material. Currently, the filter integrates with HashiCorp Vault as its KMS, though further supported KMS integrations are planned.
1.1.1. How the filter encrypts records
The filter encrypts records from produce requests as follows:
- The filter selects a KEK to apply.
- It requests the KMS to generate a DEK for the KEK; the KMS returns the DEK in plaintext form and in encrypted form (the DEK encrypted with the KEK).
- It uses the plaintext DEK to encrypt the record.
- It replaces the original record with a cipher record (the encrypted record, the encrypted DEK, and metadata).
The filter uses a DEK reuse strategy: records sent to the same topic are encrypted with the same DEK until a timeout or an encryption limit is reached, after which a fresh DEK is requested from the KMS.
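The following is a minimal sketch of this reuse strategy in Java. The Kms interface, the limits, and the returned ciphertext layout are assumptions made for illustration; the filter's actual configuration options and cipher-record format differ.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.security.SecureRandom;
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DekReuseSketch {

    /** Hypothetical KMS call: returns a fresh DEK in plaintext and encrypted form. */
    interface Kms {
        DekPair generateDek(String kekRef);
    }

    record DekPair(SecretKey plaintextDek, byte[] encryptedDek) {}

    /** A DEK cached for reuse on one topic, with usage bookkeeping. */
    private static final class CachedDek {
        final DekPair pair;
        final Instant created = Instant.now();
        long encryptions;

        CachedDek(DekPair pair) {
            this.pair = pair;
        }

        boolean exhausted(long maxEncryptions, Duration maxAge) {
            return encryptions >= maxEncryptions
                    || created.plus(maxAge).isBefore(Instant.now());
        }
    }

    private static final long MAX_ENCRYPTIONS = 1_000_000;       // illustrative limit
    private static final Duration MAX_AGE = Duration.ofHours(1); // illustrative timeout
    private static final SecureRandom RANDOM = new SecureRandom();

    private final Kms kms;
    private final Map<String, CachedDek> dekByTopic = new ConcurrentHashMap<>();

    DekReuseSketch(Kms kms) {
        this.kms = kms;
    }

    /** Encrypts one record value, reusing the topic's DEK until a limit is hit. */
    byte[] encryptValue(String topic, String kekRef, byte[] value) throws Exception {
        CachedDek dek = dekByTopic.compute(topic, (t, existing) -> {
            CachedDek d = (existing == null || existing.exhausted(MAX_ENCRYPTIONS, MAX_AGE))
                    ? new CachedDek(kms.generateDek(kekRef)) // KMS round trip
                    : existing;                              // reuse, no KMS call
            d.encryptions++;
            return d;
        });
        byte[] iv = new byte[12];
        RANDOM.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, dek.pair.plaintextDek(), new GCMParameterSpec(128, iv));
        // A real cipher record also carries the encrypted DEK and metadata;
        // here only the IV-prefixed ciphertext is returned for brevity.
        byte[] ciphertext = cipher.doFinal(value);
        byte[] out = new byte[iv.length + ciphertext.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ciphertext, 0, out, iv.length, ciphertext.length);
        return out;
    }
}
```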
1.1.2. How the filter decrypts records
The filter decrypts records from fetch responses as follows:
- The filter receives a cipher record from the Kafka broker.
- It reverses the process that constructed the cipher record, recovering the encrypted DEK and the encrypted record.
- It uses the KMS to decrypt the encrypted DEK.
- It uses the decrypted DEK to decrypt the encrypted record.
- It replaces the cipher record with the decrypted record.
The filter uses an LRU (least recently used) strategy for caching decrypted DEKs. Decrypted DEKs are kept in memory to reduce interactions with the KMS.
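A least-recently-used cache of decrypted DEKs can be sketched with the access-order mode of Java's LinkedHashMap, as below. The capacity and the KmsDecryptor interface are illustrative assumptions, not the filter's actual implementation.

```java
import javax.crypto.SecretKey;
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

public class DekLruCacheSketch {

    /** Hypothetical KMS call that decrypts (unwraps) an encrypted DEK. */
    interface KmsDecryptor {
        SecretKey decryptDek(byte[] encryptedDek);
    }

    private static final int CAPACITY = 1_000; // illustrative

    // accessOrder=true keeps entries in least-recently-used-first order,
    // so evicting the eldest entry implements LRU eviction.
    private final Map<ByteBuffer, SecretKey> cache =
            new LinkedHashMap<>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<ByteBuffer, SecretKey> eldest) {
                    return size() > CAPACITY;
                }
            };

    private final KmsDecryptor kms;

    DekLruCacheSketch(KmsDecryptor kms) {
        this.kms = kms;
    }

    /** Returns the decrypted DEK, calling the KMS only on a cache miss. */
    synchronized SecretKey decryptedDek(byte[] encryptedDek) {
        ByteBuffer key = ByteBuffer.wrap(encryptedDek.clone());
        SecretKey dek = cache.get(key);         // refreshes recency on a hit
        if (dek == null) {
            dek = kms.decryptDek(encryptedDek); // KMS round trip on miss only
            cache.put(key, dek);
        }
        return dek;
    }
}
```

Keeping decrypted DEKs in memory is a deliberate trade-off: it avoids a KMS round trip for every fetched batch, at the cost of holding key material in the proxy's memory.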
1.1.3. How the filter uses the KMS
To support the filter, the KMS provides the following:
- A secure repository for storing Key Encryption Keys (KEKs)
- A service for generating and decrypting Data Encryption Keys (DEKs)
KEKs stay within the KMS. The KMS generates a DEK (securely generated random data) for a given KEK, then returns both the DEK and an encrypted DEK, which is the same key material encrypted with the KEK. The KMS does not store DEKs; they are stored, in encrypted form, as part of the cipher record in the broker.
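Expressed as a hypothetical Java interface, the contract the filter needs from the KMS looks like the following sketch; the method names and types are illustrative and not the proxy's actual integration API.

```java
import javax.crypto.SecretKey;

/** Hypothetical contract; the proxy's actual KMS integration API differs. */
public interface KmsContractSketch {

    /**
     * Generates a fresh DEK under the KEK identified by kekRef. The DEK is
     * returned twice: in plaintext for immediate use, and encrypted by the
     * KEK for storage in the cipher record. The KEK itself never leaves
     * the KMS, and the KMS does not retain the DEK.
     */
    DekPair generateDek(String kekRef);

    /** Decrypts an encrypted DEK using the KEK held inside the KMS. */
    SecretKey decryptDek(byte[] encryptedDek);

    record DekPair(SecretKey plaintextDek, byte[] encryptedDek) {}
}
```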
The KMS must be available at runtime. If the KMS is unavailable, production and consumption through the filter become impossible until KMS service is restored. Running the KMS in a high availability (HA) configuration is recommended.
1.1.4. What part of a record is encrypted?
The record encryption filter encrypts only the values of records, leaving record keys, headers, and timestamps untouched. Null record values, which might represent deletions in compacted topics, are transmitted to the broker unencrypted. This approach ensures that compacted topics function correctly.
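As a minimal sketch, with a hypothetical encryptor parameter standing in for the filter's encryption step, the rule amounts to a null check before encryption:

```java
import java.util.function.UnaryOperator;

public class TombstoneRuleSketch {

    /**
     * Applies the encryptor only to non-null values. A null value (a
     * tombstone in a compacted topic) is forwarded unchanged so that log
     * compaction still recognizes it as a deletion.
     */
    static byte[] maybeEncrypt(byte[] value, UnaryOperator<byte[]> encryptor) {
        return value == null ? null : encryptor.apply(value);
    }
}
```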
1.1.5. Unencrypted topics
You can configure the filter so that some topics are encrypted and others are not. This supports scenarios where topics containing confidential information are encrypted, while topics with non-sensitive information are left unencrypted.
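The following sketch illustrates the idea of per-topic selection, assuming a hypothetical mapping from topic name to KEK reference; the filter's real mechanism for selecting which topics to encrypt is part of its configuration and differs in detail.

```java
import java.util.Map;
import java.util.Optional;

public class TopicSelectionSketch {

    private final Map<String, String> kekRefByTopic;

    TopicSelectionSketch(Map<String, String> kekRefByTopic) {
        this.kekRefByTopic = kekRefByTopic;
    }

    /** An empty result means the topic's records are forwarded unencrypted. */
    Optional<String> kekFor(String topic) {
        return Optional.ofNullable(kekRefByTopic.get(topic));
    }

    public static void main(String[] args) {
        // Only the confidential topic is mapped to a KEK.
        TopicSelectionSketch selection = new TopicSelectionSketch(
                Map.of("payments", "payments-kek"));
        System.out.println(selection.kekFor("payments")); // Optional[payments-kek]
        System.out.println(selection.kekFor("metrics"));  // Optional.empty -> plaintext
    }
}
```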