Chapter 5. Register the Required Components


OpenStack Data Processing requires a Hadoop image containing the necessary elements to launch and use Hadoop clusters. Specifically, Red Hat OpenStack Platform requires an image containing Red Hat Enterprise Linux with the necessary data processing plug-in.

Once you have a Hadoop image suitable for the jobs you wish to run, register it to the OpenStack Data Processing service. To do so:

  1. Upload the image to the Image service. For instructions on how to do so, see Upload an Image.
  2. After uploading the image, select Project > Data Processing > Image Registry in the dashboard.
  3. Click Register Image, and select the Hadoop image from the Image drop-down menu.
  4. Enter the user name that the OpenStack Data Processing service should use to apply settings and manage processes on each instance (node). On the official Red Hat Enterprise Linux images (used in Chapter 4, Create Hadoop Image), the user name set for this purpose is cloud-user.
  5. By default, the OpenStack Data Processing service selects the necessary plug-in and version tags in the Plugin and Version drop-down menus. Verify that the tag selection is correct, then click Add plugin tags to add them. The OpenStack Data Processing service also allows you to use custom tags to differentiate or group registered images. Use the Add custom tag button to add a tag; tags appear in the box under the Description field.

    To remove a custom tag, click the x beside its name.

  6. Click Done. The image should now appear in the Image Registry table.
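
You can also perform this registration programmatically with the python-saharaclient library. The following is a minimal sketch, not a definitive procedure: the authentication URL, credentials, image ID, and the plug-in name and version tags are placeholders that you must replace with values for your environment.

    from keystoneauth1 import session
    from keystoneauth1.identity import v3
    from saharaclient import client as sahara_client

    # Authenticate against the Identity service (placeholder credentials).
    auth = v3.Password(auth_url='http://controller:5000/v3',
                       username='admin',
                       password='PASSWORD',
                       project_name='admin',
                       user_domain_name='Default',
                       project_domain_name='Default')
    sess = session.Session(auth=auth)
    sahara = sahara_client.Client('1.1', session=sess)

    # ID of the Hadoop image previously uploaded to the Image service.
    image_id = 'IMAGE_UUID'

    # Register the image, setting the user name the service uses on each node.
    sahara.images.update_image(image_id, user_name='cloud-user',
                               desc='RHEL Hadoop image')

    # Apply plug-in and version tags (example values for the vanilla plug-in).
    sahara.images.update_tags(image_id, new_tags=['vanilla', '2.7.1'])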

5.1. Register Input and Output Data Sources

After registering an image, register your data input source and output destination. You can register both as objects in the Object Storage service; to do so, you must first upload them as objects. For instructions, see Upload an Object.
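
If you prefer to upload objects programmatically, you can use the python-swiftclient library. A minimal sketch, reusing the authenticated session (sess) from the earlier python-saharaclient example; the container and object names are placeholders:

    import swiftclient.client

    # Reuse the keystoneauth1 session (sess) created earlier.
    swift = swiftclient.client.Connection(session=sess)

    # Create a container and upload a local file into it as an object.
    swift.put_container('job-data')
    with open('input.txt', 'rb') as f:
        swift.put_object('job-data', 'input.txt', contents=f)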

Note

You can also register data objects straight from another Hadoop-compatible distributed file system (for example, HDFS). For information on how to upload data to your chosen distributed file system, see its documentation.

  1. In the dashboard, select Project > Data Processing > Data Sources.
  2. Click Create Data Source. Enter a name for your data source in the Name field.
  3. Use the Description field to describe the data source (optional).
  4. Select your data source’s type and URL. The procedure for doing so depends on your source’s location:

    • If your data is located in the Object Storage service, select Swift from the Data Source Type drop-down menu. Then:

      1. Provide the container and object name of your data source as swift://CONTAINER/OBJECT in the URL field.
      2. If your data source requires a login, supply the necessary credentials in the Source username and Source password fields.
    • If your data is located in a Hadoop Distributed File System (HDFS), select the corresponding source from the Data Source Type drop-down menu. Then, enter the data source’s URL in the URL field as hdfs://HDFSHOST:PORT/OBJECTPATH, where:

      • HDFSHOST is the host name of the HDFS host.
      • PORT is the port on which the data source is accessible.
      • OBJECTPATH is the available path to the data source on HDFSHOST.
    • If your data is located in an S3 object store, select the corresponding source from the Data Source Type drop-down menu. Then, enter the data source URL in the URL field in the format s3://bucket/path/to/object.

      • If you have not already configured the following parameters in the cluster configuration or job execution settings, you must configure them here:

        • S3 access key
        • S3 secret key
        • S3 endpoint: the URL of the S3 service, without the protocol
        • Use SSL: must be a boolean value
        • Use bucket in path: determines whether virtual-hosted-style or path-style URLs are used; must be a boolean value
  5. Click Done. The data source should now be available in the Data Sources table.

Perform this procedure for each data input/output object required for your jobs.
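
The same registrations can be scripted with python-saharaclient. A minimal sketch, reusing the sahara client created earlier; all names, URLs, and credentials below are placeholders:

    # Swift input source; credentials are needed only if the source
    # requires a login.
    sahara.data_sources.create(
        name='job-input',
        description='Input data for the job',
        data_source_type='swift',
        url='swift://job-data/input.txt',
        credential_user='USERNAME',
        credential_pass='PASSWORD')

    # HDFS output destination.
    sahara.data_sources.create(
        name='job-output',
        description='Output destination for job results',
        data_source_type='hdfs',
        url='hdfs://hdfshost:8020/user/hadoop/output')

    # For an S3 source, use data_source_type='s3' with an s3:// URL;
    # how the S3 credentials are supplied varies by client version.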
