此内容没有您所选择的语言版本。

Chapter 1. Overview

The OpenStack Data Processing service (sahara) provides a robust interface to easily provision and scale Hadoop clusters. Such clusters can then be used to run resource-intensive jobs, typically for processing large data sets. As an OpenStack component, OpenStack Data Processing is fully integrated into the OpenStack ecosystem; for example, users can administer the entire Hadoop data processing workflow through the OpenStack dashboard — from configuring clusters, all the way to launching and running jobs on them.

For more information about Hadoop, see http://hadoop.apache.org/.

Note

The Data Processing service (sahara) is deprecated in OpenStack Platform version 15 and is targeted for removal in version 16.

OpenStack Data Processing uses different plug-ins for provisioning specific clusters of each Hadoop distribution. The Hadoop parameters available for configuring clusters differ depending on the Hadoop distribution (and, by extension, the plug-in used). As of this release (Red Hat OpenStack Platform 15), OpenStack Data Processing supports the following plug-ins:

Cloudera (CDH)
- version 5.7.x (default is 5.7.0)
- version 5.9.x (default is 5.9.0)
- version 5.11.x (default is 5.11.0)
- version 5.13.x (default is 5.13.0)
Hortonworks Data Platform
- version 2.3
- version 2.4
MapR 5.2

OpenStack Data Processing also includes a Guides tab. This tab features wizards that will help you create the templates necessary in order to launch clusters and run jobs on them. However, you will still need to register the components necessary for using OpenStack Data Processing, such as Hadoop images and job binaries. As such, if you intend to use the Guides feature, we recommend you read Chapter 5, Register the Required Components first.

此内容没有您所选择的语言版本。

Chapter 1. Overview

学习

尝试、购买和销售

社区

關於紅帽

让开源更具包容性

关于红帽文档

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links