11.17. Consistent time attributes within Replica and Disperse subvolumes
Traditionally, gluster has used the time attributes (ctime, atime, mtime) of files and directories as stored on the bricks. The problem with this approach is that these attributes are not consistent across the bricks of a replica set, because the bricks are hosted by different nodes. Applications that depend on these timestamps break because the time attributes are not always returned from the same brick of a replica set.
One way to solve this issue would have been to have gluster serve the stat structures from the same brick of a replica set, and use max-time in DHT. However, this does not avoid the problem completely, because there is currently no system call that can change ctime (lutimes() only allows mtime and atime to be set). As a result, a consistent ctime cannot be maintained across replica bricks after self-heal, internal xattr updates, and rebalance.
Hence, the solution is to store the time attributes (ctime, mtime, and atime) as an xattr (extended attribute) of the file. The xattr is updated based on the file operations. For example, if a file operation changes only mtime and ctime, gluster updates only those attributes in the xattr for that file. The xattr is maintained consistently on all backend bricks of a replica set.
11.17.1. Pre-requisites
Time must be synchronized between all client nodes. Red Hat recommends setting up a network time protocol service to keep time synchronized between all client nodes and to avoid inconsistent time attributes. See Network Time Protocol Setup.
11.17.2. Enabling and disabling the Consistent Time Feature
The consistent time feature is disabled by default.
To enable the ctime feature for a specified volume, execute the following command:
# gluster volume set VOLNAME ctime on
To disable the ctime feature for a specified volume, execute the following command:
# gluster volume set VOLNAME ctime off
11.17.3. Advantages of Consistent Time Feature
Several applications, such as tar and Elasticsearch, report “file changed as we read it” and “Underlying file changed by an external force” warnings whenever they detect ctime differences, which happens when stat is served from different bricks. With the consistent time feature enabled, these applications no longer report these warnings, because time attributes are served from extended attributes that are consistent across replica bricks.
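The kind of check that produces such warnings can be sketched as follows. This is a simplified illustration, not the actual code of tar or Elasticsearch: it takes a stat before and after reading a file and compares the two ctime values. The function name changed_while_reading and the example path are hypothetical.

```python
import os


def changed_while_reading(path, chunk_size=1 << 16):
    """Return True if the file's ctime differs before and after a full read."""
    before = os.stat(path)
    with open(path, "rb") as handle:
        # Read the whole file in chunks, as an archiver would.
        while handle.read(chunk_size):
            pass
    after = os.stat(path)
    # Compare at nanosecond resolution to avoid float rounding.
    return before.st_ctime_ns != after.st_ctime_ns


if __name__ == "__main__":
    # Hypothetical path on a mounted volume; replace with a real file.
    example_path = "/mnt/glustervol/example.txt"
    if os.path.exists(example_path):
        print(changed_while_reading(example_path))
```

If the two stat calls are answered by different bricks whose ctime values differ, a check like this reports a change even though nothing modified the file.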
11.17.4. Extended Attribute Format
The extended attribute used to store the time attributes has the following format:
trusted.glusterfs.mdata = "<version - 8 bits> <flags - 64 bits> <ctime sec - 64 bits> <ctime nsec - 64 bits> <mtime sec - 64 bits> <mtime nsec - 64 bits> <atime sec - 64 bits> <atime nsec - 64 bits>"
Example:
trusted.glusterfs.mdata=0x010000000000000000000000005cefab7b000000002bcb2587000000005cefab7b000000002bcb2587000000005cefab7b000000002b73964d
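A minimal sketch of decoding such a value is shown here. It assumes that the fields are packed back to back in the order given by the format line, without padding and in big-endian byte order, which is what the example value suggests. The names decode_mdata and MDATA_LAYOUT are illustrative and are not part of gluster.

```python
import struct

# Assumed layout, read off the format line: a 1-byte version, an 8-byte
# flags field, then six 8-byte integers, all big-endian and unpadded.
MDATA_LAYOUT = struct.Struct(">B7Q")

FIELD_NAMES = (
    "version", "flags",
    "ctime_sec", "ctime_nsec",
    "mtime_sec", "mtime_nsec",
    "atime_sec", "atime_nsec",
)


def decode_mdata(hex_value):
    """Split a hex-encoded mdata value into named integer fields."""
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != MDATA_LAYOUT.size:
        raise ValueError(
            "expected %d bytes, got %d" % (MDATA_LAYOUT.size, len(raw))
        )
    return dict(zip(FIELD_NAMES, MDATA_LAYOUT.unpack(raw)))


# The example value from this section, split by field for readability.
EXAMPLE = (
    "0x"
    "01"                # version
    "0000000000000000"  # flags
    "000000005cefab7b"  # ctime seconds
    "000000002bcb2587"  # ctime nanoseconds
    "000000005cefab7b"  # mtime seconds
    "000000002bcb2587"  # mtime nanoseconds
    "000000005cefab7b"  # atime seconds
    "000000002b73964d"  # atime nanoseconds
)

fields = decode_mdata(EXAMPLE)
for name in FIELD_NAMES:
    print(name, fields[name])
```

In this example, ctime and mtime hold the same seconds and nanoseconds values, and atime differs only in its nanoseconds field.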
11.17.5. Upgrade
Older files (created before the upgrade, when the ctime feature was either not available or not enabled) do not have the “trusted.glusterfs.mdata” xattr, which stores the consistent time attributes on all bricks of a replica set. The xattr is created on the first lookup of the file after the upgrade, or after this feature is enabled. Note that the xattr creation must be driven from the client, not from the server, to get consistent time attributes.
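One way to check whether a given file already carries the xattr is sketched below. It assumes a Linux host, root privileges (attributes in the trusted namespace are not visible to unprivileged users), and a path on a brick's backend directory. The function name has_mdata_xattr and the example path are hypothetical.

```python
import errno
import os

MDATA_XATTR = "trusted.glusterfs.mdata"


def has_mdata_xattr(brick_file_path):
    """Return True if the file carries the consistent-time xattr."""
    try:
        os.getxattr(brick_file_path, MDATA_XATTR)
    except OSError as err:
        # ENODATA: the attribute is absent (or hidden from this user).
        # ENOTSUP: the filesystem does not support extended attributes.
        if err.errno in (errno.ENODATA, errno.ENOTSUP):
            return False
        raise
    return True


if __name__ == "__main__":
    # Hypothetical brick path; replace with a real file on a brick.
    example_path = "/bricks/brick1/example.txt"
    if os.path.exists(example_path):
        print(has_mdata_xattr(example_path))
```

A result of False for a file created before the upgrade is expected until a client looks the file up.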
11.17.6. Limitations
- Access time (atime) updates are not supported by default. Support can be enabled by setting the “ctime.noatime” option to “off”, but doing so causes a significant performance drop. Replicated and dispersed volumes read data from one subvolume, so each atime update modifies the xattr on that subvolume and triggers self-heal on the other subvolumes of the replica set.
- Mounting a gluster volume with time attribute options (noatime, relatime) is not supported with this feature.
- This feature does not guarantee consistent time for directories if the hashed sub-volume for the directory is down.
- Directory listing may report inconsistent time information. This feature is therefore not supported for workloads that rely heavily on directory listing or metadata.