12.2. Installing the Hadoop FileSystem Plugin for Red Hat Storage
12.2.1. Adding the Hadoop Installer for Red Hat Storage
Install the Hadoop installer packages by running the following command:

# yum install rhs-hadoop rhs-hadoop-install
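If you want to confirm that both packages were installed, you can query the RPM database. This is an optional check, not part of the documented procedure:

# rpm -q rhs-hadoop rhs-hadoop-install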
The YARN Master Server is required to FUSE mount all Red Hat Storage volumes that are used with Hadoop. It must have the Red Hat Storage Client Channel enabled so that the setup_cluster script can install the Red Hat Storage Client Libraries on it.
- If you have registered your machine using Red Hat Subscription Manager, enable the channel by running the following command:
# subscription-manager repos --enable=rhel-6-server-rhs-client-1-rpms
- If you have registered your machine using Satellite server, enable the channel by running the following command:
# rhn-channel --add --channel=rhel-x86_64-server-rhsclient-6
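To verify that the client channel or repository is now available, you can list the enabled repositories. The grep pattern below is an assumption based on the repository label used in the Subscription Manager command above; the label shown on Satellite-registered systems may differ:

# yum repolist enabled | grep rhs-client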
12.2.2. Configuring the Trusted Storage Pool for use with Hadoop
- Open the terminal window of the server designated to be the Ambari Management Server and navigate to the /usr/share/rhs-hadoop-install/ directory.
- Run the Hadoop cluster configuration script as given below:
setup_cluster.sh [-y] [--hadoop-mgmt-node <node>] [--yarn-master <node>] <node-list-spec>

where <node-list-spec> is:

<node1>:<brickmnt1>:<blkdev1> <node2>[:<brickmnt2>][:<blkdev2>] [<node3>[:<brickmnt3>][:<blkdev3>]] ... [<nodeN>[:<brickmntN>][:<blkdevN>]]

where
- <brickmnt> is the name of the XFS mount for the above <blkdev>, for example, /mnt/brick1 or /external/HadoopBrick. When a Red Hat Storage volume is created, its bricks have the volume name appended, so <brickmnt> is a prefix for the volume's bricks. For example, if a new volume is named HadoopVol, then its brick list would be <node>:/mnt/brick1/HadoopVol or <node>:/external/HadoopBrick/HadoopVol.
- <blkdev> is the name of a Logical Volume device path, for example, /dev/VG1/LV1 or /dev/mapper/VG1-LV1. Since LVM is a prerequisite for Red Hat Storage, the <blkdev> is not expected to be a raw block device path, such as /dev/sdb.
Given below is an example of running the setup_cluster.sh script on the YARN Master server and four Red Hat Storage nodes, each of which has the same logical volume and mount point intended to be used as a Red Hat Storage brick:

./setup_cluster.sh --yarn-master yarn.hdp rhs-1.hdp:/mnt/brick1:/dev/rhs_vg1/rhs_lv1 rhs-2.hdp rhs-3.hdp rhs-4.hdp

Note
If a brick mount is omitted, the brick mount of the first node is used; if a block device is omitted, the block device of the first node is used.
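Once setup_cluster.sh completes, you can optionally confirm that all of the storage nodes have joined the trusted storage pool by running the standard Gluster peer check from any storage node. This is a suggested verification, not part of the official procedure:

# gluster peer status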
12.2.3. Creating Volumes for use with Hadoop
Note
Do not name the volume hadoop or mapredlocal.
- Open the terminal window of the server designated to be the Ambari Management Server and navigate to the /usr/share/rhs-hadoop-install/ directory.
- Run the volume creation script as given below:
create_vol.sh [-y] <volName> <volMountPrefix> <node-list>

where
- <node-list> is: <node1>:<brickmnt> <node2>[:<brickmnt2>] <node3>[:<brickmnt3>] ... [<nodeN>[:<brickmntN>]]
- <brickmnt> is the name of the XFS mount for the block devices used by the above nodes, for example, /mnt/brick1 or /external/HadoopBrick. When a Red Hat Storage volume is created, its bricks will have the volume name appended, so <brickmnt> is a prefix for the volume's bricks. For example, if a new volume is named HadoopVol, then its brick list would be <node>:/mnt/brick1/HadoopVol or <node>:/external/HadoopBrick/HadoopVol.
Note
The node-list for create_vol.sh is similar to the node-list-spec used by setup_cluster.sh, except that a block device is not specified for create_vol.sh.

Given below is an example of how to create a volume named HadoopVol using four Red Hat Storage servers, each with the same brick mount, and mount the volume on /mnt/glusterfs:

./create_vol.sh HadoopVol /mnt/glusterfs rhs-1.hdp:/mnt/brick1 rhs-2.hdp rhs-3.hdp rhs-4.hdp
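If you want to confirm that the volume was created and mounted as expected, the standard Gluster and disk usage commands below can be run from any storage node. This verification is a suggestion rather than part of the documented procedure, and it assumes the volume mount prefix /mnt/glusterfs used in the example above:

# gluster volume info HadoopVol
# df -h /mnt/glusterfs/HadoopVol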
12.2.4. Adding the User Directories for the Hadoop Processes on the Red Hat Storage Volume
Create the user directories for the Hadoop processes on the mounted Red Hat Storage volume by running the following commands:
# mkdir /mnt/glusterfs/HadoopVol/user/mapred
# mkdir /mnt/glusterfs/HadoopVol/user/yarn
# mkdir /mnt/glusterfs/HadoopVol/user/hcat
# mkdir /mnt/glusterfs/HadoopVol/user/hive
# mkdir /mnt/glusterfs/HadoopVol/user/ambari-qa
Set the ownership of these user directories by running the following commands:
# chown ambari-qa:hadoop /mnt/glusterfs/HadoopVol/user/ambari-qa
# chown hive:hadoop /mnt/glusterfs/HadoopVol/user/hive
# chown hcat:hadoop /mnt/glusterfs/HadoopVol/user/hcat
# chown yarn:hadoop /mnt/glusterfs/HadoopVol/user/yarn
# chown mapred:hadoop /mnt/glusterfs/HadoopVol/user/mapred
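The same directories and ownership can also be applied with a short loop. This is an equivalent sketch of the individual commands above, assuming the volume is mounted at /mnt/glusterfs/HadoopVol:

for u in mapred yarn hcat hive ambari-qa; do
    mkdir -p /mnt/glusterfs/HadoopVol/user/$u              # create the per-user directory
    chown $u:hadoop /mnt/glusterfs/HadoopVol/user/$u       # owned by the service user, group hadoop
done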
12.2.5. Deploying and Configuring the HDP 2.0.6 Stack on Red Hat Storage using Ambari Manager
Important
Selecting HDFS as the storage option in the HDP 2.0.6.GlusterFS stack is not supported. If you want to deploy HDFS, you must select the HDP 2.0.6 stack (not HDP 2.0.6.GlusterFS) and follow the instructions in the Hortonworks documentation.
- Launch a web browser and enter http://hostname:8080 in the URL, replacing hostname with the hostname of your Ambari Management Server.
Note
If the Ambari Console fails to load in the browser, it is usually because iptables is still running. Stop iptables by opening a terminal window and running the service iptables stop command.
- Enter admin and admin for the username and password.
- Assign a name to your cluster, such as MyCluster.
- Select the HDP 2.0.6.GlusterFS stack (if not already selected by default) and click Next.
- On the Install Options screen:
  - For Target Hosts, add the YARN server and all the nodes in the trusted storage pool.
  - Select the Perform manual registrations on hosts and do not use SSH option.
  - Accept any warnings you may see and click the Register and Confirm button.
  - Click OK on the Before you proceed warning. The Ambari Agents have all been installed for you by the setup_cluster.sh script.
- For Confirm Hosts, the progress must show as green for all the hosts. Click Next and ignore the Host Check warning.
- For Choose Services, unselect HDFS and, as a minimum, select GlusterFS, Ganglia, YARN+MapReduce2, and ZooKeeper.
Note
  - Do not select the Nagios service, as it is not supported. For more information, see subsection 21.1. Deployment Scenarios of chapter 21. Administering the Hortonworks Data Platform on Red Hat Storage in the Red Hat Storage 3.0 Administration Guide.
  - The use of HBase has not been extensively tested and is not yet supported.
  - This section describes how to deploy HDP on Red Hat Storage. Selecting HDFS as the storage option in the HDP 2.0.6.GlusterFS stack is not supported. If users wish to deploy HDFS, they must select the HDP 2.0.6 stack (not HDP 2.0.6.GlusterFS) and follow the instructions in the Hortonworks documentation.
- For Assign Masters, set all the services to your designated YARN Master Server. For ZooKeeper, select at least 3 separate nodes within your cluster.
- For Assign Slaves and Clients, select all the nodes as NodeManagers except the YARN Master Server. You must also ensure that the Client checkbox is selected for each node.
- On the Customize Services screen:
  - Click the YARN tab, scroll down to the yarn.nodemanager.log-dirs and yarn.nodemanager.local-dirs properties, and remove any entries that begin with /mnt/glusterfs/.
  - Click the MapReduce2 tab, scroll down to the Advanced section, and modify the following property:

    Key: yarn.app.mapreduce.am.staging-dir
    Value: glusterfs:///user

  - Click the MapReduce2 tab, scroll down to the bottom, and under the custom mapred-site.xml, add the following four custom properties (see the spot-check sketch after this procedure) and then click the Next button:

    Key                                              Value
    mapred.healthChecker.script.path                 glusterfs:///mapred/jobstatus
    mapred.job.tracker.history.completed.location    glusterfs:///mapred/history/done
    mapred.system.dir                                glusterfs:///mapred/system
    mapreduce.jobtracker.staging.root.dir            glusterfs:///user

  - Review other tabs that are highlighted in red. These require you to enter additional information, such as passwords for the respective services.
- Review your configuration and then click the Deploy button. Once the deployment is complete, it will state that the deployment is 100% complete and the progress bars will be colored orange.
Note
The deployment process is susceptible to network and bandwidth issues. If the deployment fails, try clicking Retry to attempt the deployment again. This often resolves the issue.
- Click Next to proceed to the Ambari Dashboard. Select the YARN service on the top left and click Stop-All. Do not click Start-All until you perform the steps in Section 12.2.7, “Configuring the Linux Container Executor”.
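If Ambari has already written client configurations to /etc/hadoop/conf (the configuration directory used later in this chapter), you can spot-check that the custom mapred-site.xml properties from the Customize Services step were applied. This is an optional, hedged check; the file path is an assumption based on that directory:

# grep -A1 'mapred.system.dir\|mapreduce.jobtracker.staging.root.dir' /etc/hadoop/conf/mapred-site.xml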
12.2.6. Enabling Existing Volumes for use with Hadoop
Important
Even if you have created a volume using the create_vol.sh script, you must follow the steps listed in this section as well.
- Open the terminal window of the server designated to be the Ambari Management Server and navigate to the /usr/share/rhs-hadoop-install/ directory.
- Run the Hadoop Trusted Storage pool configuration script as given below:
enable_vol.sh [-y] [--hadoop-mgmt-node <node>] [--user <admin-user>] [--pass <admin-password>] [--port <mgmt-port-num>] [--yarn-master <node>] [--rhs-node <storage-node>] <volName>

For example:

./enable_vol.sh --yarn-master yarn.hdp --rhs-node rhs-1.hdp HadoopVol

Note
If the --yarn-master and/or --rhs-node options are omitted, the default of localhost (the node from which the script is being executed) is assumed. --rhs-node is the hostname of any of the storage nodes in the trusted storage pool. It is required so that the script can access the gluster command; the default is localhost, which must have gluster CLI access.
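Because the node passed as --rhs-node must be able to run the gluster command, you can confirm CLI access from that node before running enable_vol.sh. This is an optional check using the standard Gluster CLI and the HadoopVol volume created earlier:

# gluster volume status HadoopVol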
12.2.7. Configuring the Linux Container Executor
The Linux Container Executor defines how a container is launched and controlled. It sets up restricted permissions and the user/group ownership of local files and directories used by the containers, such as the shared objects, jars, intermediate files, log files, and so on. Perform the following steps to configure the Linux Container Executor program:
- In the Ambari console, click Stop All in the Services navigation panel. You must wait until all the services are completely stopped.
- On each server within the Red Hat Storage trusted storage pool:
  - Open the terminal and navigate to the /usr/share/rhs-hadoop-install/ directory.
  - Execute the setup_container_executor.sh script.
- On each server within the Red Hat Storage trusted storage pool and the YARN Master server:
  - Open the terminal and navigate to the /etc/hadoop/conf/ directory.
  - Replace the contents of the container-executor.cfg file with the following:
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=yarn
min.user.id=1000
allowed.system.users=tom

Note
Ensure that there is no additional whitespace at the end of each line or at the end of the file. Also, tom is an example user. Hadoop ignores the allowed.system.users parameter, but we recommend having at least one valid user. You can modify this file on one server and then use Secure Copy (or any other approach) to copy the modified file to the same location on each server.
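As one way of distributing the edited file, the loop below uses scp from the server where you made the change. The hostnames are the example node names used earlier in this chapter; substitute your own node names and list only the remaining servers:

for h in rhs-2.hdp rhs-3.hdp rhs-4.hdp yarn.hdp; do
    # copy the edited file to the same location on each of the other servers
    scp /etc/hadoop/conf/container-executor.cfg ${h}:/etc/hadoop/conf/
done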