314.7. Hive jobs
Instead of working with RDDs or DataFrames, the Spark component can also receive Hive SQL queries as payloads. To send a Hive query to the Spark component, use the following URI:
Spark Hive producer
spark:hive
The following snippet demonstrates how to send a message as input to the job and return the results:
Calling a Spark job
long carsCount = template.requestBody("spark:hive?collect=false", "SELECT * FROM cars", Long.class);
List<Row> cars = template.requestBody("spark:hive", "SELECT * FROM cars", List.class);
The table we want to execute the query against should be registered in the HiveContext before the query is executed. For example, in Spring such a registration could look like the following:
Spark RDD definition
@Bean
Dataset<Row> cars(HiveContext hiveContext) {
    Dataset<Row> jsonCars = hiveContext.read().json("/var/data/cars.json");
    jsonCars.registerTempTable("cars");
    return jsonCars;
}