21.4. Running Hadoop Jobs Across Multiple Red Hat Storage Volumes
When you specify paths in a Hadoop job, the full URI of the path is required. For example, to pass in a file named myinput.txt that resides in a directory named input on a volume named VolumeOne, specify the path as glusterfs://VolumeOne/input/myinput.txt. The same format applies to output paths. The example below reads data from a path on VolumeOne and writes the output to a path on VolumeTwo.
# bin/hadoop jar /opt/HadoopJobs.jar ProcessLogs glusterfs://VolumeOne/input/myinput.txt glusterfs://VolumeTwo/output/
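For jobs you write yourself, the driver sketch below sets the same cross-volume input and output paths programmatically with the standard MapReduce API. It is a minimal illustration, not the actual ProcessLogs implementation: the class name is borrowed from the command above, and mapper, reducer, and output type setup are omitted, so as written it runs as an identity job.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ProcessLogs {
    public static void main(String[] args) throws Exception {
        // Assumes the glusterfs:// filesystem plugin is configured in core-site.xml.
        Job job = Job.getInstance(new Configuration(), "process-logs");
        job.setJarByClass(ProcessLogs.class);

        // Input and output live on different Red Hat Storage volumes,
        // so each path carries its full glusterfs:// URI.
        FileInputFormat.addInputPath(job,
                new Path("glusterfs://VolumeOne/input/myinput.txt"));
        FileOutputFormat.setOutputPath(job,
                new Path("glusterfs://VolumeTwo/output/"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}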
Note
Paths that omit the volume URI are resolved against the default volume. For example, glusterfs://HadoopVol/input/myinput.txt and /input/myinput.txt are processed the same when providing input to a Hadoop job or using the Hadoop CLI, because HadoopVol is the default volume in this example.
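As a quick check, both of the following commands read the same file through the Hadoop CLI:
# bin/hadoop fs -cat glusterfs://HadoopVol/input/myinput.txt
# bin/hadoop fs -cat /input/myinput.txt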