21.4. Running Hadoop Jobs Across Multiple Red Hat Storage Volumes

If you are already running Hadoop Jobs on a volume and want to enable Hadoop on additional existing Red Hat Storage volumes, follow the steps in the Enabling Existing Volumes for use with Hadoop section of the Deploying the Hortonworks Data Platform on Red Hat Storage chapter in the Red Hat Storage 3 Installation Guide. If you do not have an additional volume and want to add one, first complete the procedures in the Creating volumes for use with Hadoop section, and then the procedures in the Enabling Existing Volumes for use with Hadoop section. This configures the additional volume for use with Hadoop.
Specifying volume-specific paths when running Hadoop Jobs

When you specify paths in a Hadoop Job, the full URI of the path is required. For example, if you have a volume named VolumeOne and must pass in a file called myinput.txt in a directory named input, specify it as glusterfs://VolumeOne/input/myinput.txt. The same format applies to the output path. The example below shows data read from a path on VolumeOne and written to a path on VolumeTwo.

# bin/hadoop jar /opt/HadoopJobs.jar ProcessLogs glusterfs://VolumeOne/input/myinput.txt glusterfs://VolumeTwo/output/
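
The URI format described above can be sketched as a small shell helper. This is only an illustration of how the volume name and path combine into a glusterfs:// URI; the function name gluster_uri is hypothetical and is not part of Hadoop or Red Hat Storage. The script prints the resulting command rather than running it.

```shell
# Hypothetical helper (not part of Hadoop or Red Hat Storage):
# build a full glusterfs:// URI from a volume name and a path on that volume.
gluster_uri() {
  # ${2#/} strips one leading slash so the result has exactly one separator.
  printf 'glusterfs://%s/%s\n' "$1" "${2#/}"
}

INPUT="$(gluster_uri VolumeOne /input/myinput.txt)"
OUTPUT="$(gluster_uri VolumeTwo /output/)"

# Assemble the same job invocation used in the example above and print it.
CMD="bin/hadoop jar /opt/HadoopJobs.jar ProcessLogs $INPUT $OUTPUT"
echo "$CMD"
```

Building the URIs in one place makes it harder to leave the volume name off an input or output path when a job reads from one volume and writes to another.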


The first Red Hat Storage volume that is configured for use with Hadoop is the Default Volume. This is usually the volume name you specified when you went through the Installation Guide. The Default Volume is the only volume that does not require a full URI and can be addressed with a relative path, that is, a path without the glusterfs:// scheme and volume name. Thus, assuming your default volume is called HadoopVol, both glusterfs://HadoopVol/input/myinput.txt and /input/myinput.txt are processed the same way when providing input to a Hadoop Job or using the Hadoop CLI.
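
The equivalence between the two forms can be illustrated with a short shell sketch. It only models the naming rule stated above and does not reproduce how Hadoop resolves paths internally. The helper resolve_path is hypothetical, and the default volume name HadoopVol is the assumed name from the example.

```shell
# Assumed default volume name, taken from the example in the text.
DEFAULT_VOLUME="HadoopVol"

# Hypothetical helper: expand a path to the full URI it refers to.
resolve_path() {
  case "$1" in
    # Already a full URI: any volume may be addressed this way.
    glusterfs://*) printf '%s\n' "$1" ;;
    # No scheme or volume name: the path refers to the Default Volume.
    /*) printf 'glusterfs://%s%s\n' "$DEFAULT_VOLUME" "$1" ;;
    # Other forms are outside the scope of this sketch.
    *) echo "expected a glusterfs:// URI or a path starting with /: $1" >&2
       return 1 ;;
  esac
}

resolve_path /input/myinput.txt
resolve_path glusterfs://HadoopVol/input/myinput.txt
resolve_path glusterfs://VolumeTwo/output/
```

The first two calls print the same URI on the Default Volume, while the third shows that a path on any other volume must always carry its full URI.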