2.4. Optimizing the Metadata Cache
Overview
Proper configuration of the metadata cache is one of the key factors affecting the performance of the KahaDB message store. In a production deployment, therefore, you should always take the time to tune the properties of the metadata cache for maximum performance. Figure 2.5, “Overview of the Metadata Cache and Store” shows an overview of the metadata cache and how it interacts with the metadata store. The most important part of the metadata is the B-tree index, which is shown as a tree of nodes in the figure. The data in the cache is periodically synchronized with the metadata store, when a checkpoint is performed.
Figure 2.5. Overview of the Metadata Cache and Store
Synchronizing with the metadata store
The metadata in the cache is constantly changing in response to the events occurring in the broker. It is therefore necessary to write the metadata cache to disk from time to time, in order to restore consistency between the metadata cache and the metadata store. There are two distinct mechanisms that can trigger a synchronization between the cache and the store, as follows:
- Batch threshold: as more and more of the B-tree indexes are changed, and thus become inconsistent with the metadata store, you can define a threshold for the number of these dirty indexes. When the number of dirty indexes exceeds the threshold, KahaDB writes the cache to the store. The threshold value is set using the indexWriteBatchSize property.
- Checkpoint interval: irrespective of the current number of dirty indexes, the cache is synchronized with the store at regular time intervals, where the time interval is specified in milliseconds using the checkpointInterval property.
In addition, during a normal shutdown, the final state of the cache is synchronized with the store.
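Both triggers are configured as attributes on the kahaDB element in the broker configuration file. The following sketch shows how they might be set in activemq.xml; the directory path and the particular values are illustrative, not recommendations:

```xml
<broker xmlns="http://activemq.apache.org/schema/core" brokerName="broker" persistent="true">
  <persistenceAdapter>
    <!-- Illustrative values: flush the cache once 1000 dirty indexes accumulate,
         and checkpoint every 5000 ms regardless of how many indexes are dirty. -->
    <kahaDB directory="${activemq.data}/kahadb"
            indexWriteBatchSize="1000"
            checkpointInterval="5000"/>
  </persistenceAdapter>
</broker>
```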
Setting the cache size
In the ideal case, the cache should be big enough to hold all of the KahaDB metadata in memory. Otherwise, the cache is forced to swap pages in and out of the persistent metadata store, which causes a considerable drag on performance.
You can specify the cache size using the indexCacheSize property, which specifies the size of the cache in units of pages (where one page is 4 KB by default). Generally, the cache should be as large as possible. You can check the size of your metadata store file, db.data, to get some idea of how big the cache needs to be.
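For example, if db.data is roughly 80 MB, it contains on the order of 20,000 pages (at 4 KB per page), so an indexCacheSize of 20000 or more would allow the whole index to stay in memory. The following sketch shows the corresponding configuration, with an illustrative value and directory path:

```xml
<persistenceAdapter>
  <!-- Illustrative value: 20000 pages x 4 KB/page is roughly 80 MB of cache,
       enough to hold a db.data file of about the same size entirely in memory. -->
  <kahaDB directory="${activemq.data}/kahadb"
          indexCacheSize="20000"/>
</persistenceAdapter>
```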
Setting the write batch size
The indexWriteBatchSize property defines the threshold for the number of dirty indexes that are allowed to accumulate before KahaDB writes the cache to the store. Normally, these batch writes occur between checkpoints.
If you want to maximize performance, however, you could suppress the batch writes by setting indexWriteBatchSize to a very large number. In this case, the store would be updated only during checkpoints. The tradeoff is that, in the event of a system failure, there is a risk of losing a relatively large amount of metadata (although the broker should be able to restore the lost metadata when it restarts, by reading the tail of the journal).
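The following sketch illustrates this approach; the values and directory path are illustrative, and the batch size simply needs to be large enough that the dirty-index threshold is never reached between checkpoints:

```xml
<persistenceAdapter>
  <!-- Illustrative values: with a very large batch size the dirty-index threshold
       is effectively never reached, so the index is written to the store only at
       each checkpoint (every 5000 ms here). -->
  <kahaDB directory="${activemq.data}/kahadb"
          indexWriteBatchSize="100000"
          checkpointInterval="5000"/>
</persistenceAdapter>
```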