369.5. Samples
369.5.1. Read + Filter + Write
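This first example shows how to read a CSV file with the file component and pass it on to Weka. In Weka, we apply a few filters to the data set and then hand it back to the file component for writing.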
    @Override
    public void configure() throws Exception {

        // Use the file component to read the CSV file
        from("file:src/test/resources/data?fileName=sfny.csv")

            // Convert the 'in_sf' attribute to nominal
            .to("weka:filter?apply=NumericToNominal -R first")

            // Move the 'in_sf' attribute to the end
            .to("weka:filter?apply=Reorder -R 2-last,1")

            // Rename the relation
            .to("weka:filter?apply=RenameRelation -modify sfny")

            // Use the file component to write the Arff file
            .to("file:target/data?fileName=sfny.arff");
    }
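The range argument `-R 2-last,1` in the Reorder step can be hard to read at first glance. The following plain-Java sketch illustrates how such a spec maps to an attribute order, assuming 1-based indices and the keywords `first` and `last`. The class `ReorderSpec` and its methods are hypothetical names used only for illustration; this is not Weka's own parser and it does not validate its input.

```java
import java.util.ArrayList;
import java.util.List;

class ReorderSpec {

    // Resolve one token ("first", "last" or a 1-based number) to a 1-based index.
    static int index(String token, int numAttributes) {
        switch (token.trim()) {
            case "first": return 1;
            case "last":  return numAttributes;
            default:      return Integer.parseInt(token.trim());
        }
    }

    // Expand a spec such as "2-last,1" into the resulting attribute order.
    static List<Integer> resolve(String spec, int numAttributes) {
        List<Integer> order = new ArrayList<>();
        for (String part : spec.split(",")) {
            String[] bounds = part.split("-");
            int from = index(bounds[0], numAttributes);
            int to = index(bounds[bounds.length - 1], numAttributes);
            for (int i = from; i <= to; i++) {
                order.add(i);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // With four attributes, "2-last,1" moves the first attribute to the end.
        System.out.println(resolve("2-last,1", 4)); // prints [2, 3, 4, 1]
    }
}
```

Read this way, the spec keeps attributes 2 through the last one in place and appends attribute 1, which is how 'in_sf' ends up as the final attribute.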
Here we do the same as above without using the file component.
    @Override
    public void configure() throws Exception {

        // Initiate the route from somewhere
        from("...")

            // Use Weka to read the CSV file
            .to("weka:read?path=src/test/resources/data/sfny.csv")

            // Convert the 'in_sf' attribute to nominal
            .to("weka:filter?apply=NumericToNominal -R first")

            // Move the 'in_sf' attribute to the end
            .to("weka:filter?apply=Reorder -R 2-last,1")

            // Rename the relation
            .to("weka:filter?apply=RenameRelation -modify sfny")

            // Use Weka to write the Arff file
            .to("weka:write?path=target/data/sfny.arff");
    }
In this example, the client provides the input path or another supported type. See WekaTypeConverters for the set of supported input types.
    @Override
    public void configure() throws Exception {

        // Initiate the route from somewhere
        from("...")

            // Convert the 'in_sf' attribute to nominal
            .to("weka:filter?apply=NumericToNominal -R first")

            // Move the 'in_sf' attribute to the end
            .to("weka:filter?apply=Reorder -R 2-last,1")

            // Rename the relation
            .to("weka:filter?apply=RenameRelation -modify sfny")

            // Use Weka to write the Arff file
            .to("weka:write?path=target/data/sfny.arff");
    }
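369.5.2. Building a Model
When building a model, we first choose the classification algorithm to use and then train it with some data. The result is a trained model that we can later use to classify unseen data.
Here we train a J48 classifier using 10-fold cross-validation.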
    try (CamelContext camelctx = new DefaultCamelContext()) {

        camelctx.addRoutes(new RouteBuilder() {

            @Override
            public void configure() throws Exception {

                // Use the file component to read the training data
                from("file:src/test/resources/data?fileName=sfny-train.arff")

                    // Build a J48 classifier using cross-validation with 10 folds
                    .to("weka:model?build=J48&xval=true&folds=10&seed=1")

                    // Persist the J48 model
                    .to("weka:model?saveTo=src/test/resources/data/sfny-j48.model");
            }
        });
        camelctx.start();
    }
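To make the `folds=10` and `seed=1` options more concrete, the sketch below shows the general idea of a seeded k-fold split in plain Java: instance indices are shuffled with a fixed seed and dealt into k folds, and each fold is held out once for testing while the remaining folds are used for training. The class `FoldSplit` and its `split` method are hypothetical; this is not the code that Weka or the Camel component runs, and details such as how Weka assigns instances to folds are not modelled here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class FoldSplit {

    // Shuffle instance indices with a fixed seed and deal them into k folds.
    static List<List<Integer>> split(int numInstances, int k, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < numInstances; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        // Round-robin assignment keeps fold sizes within one instance of each other.
        for (int i = 0; i < numInstances; i++) {
            folds.get(i % k).add(indices.get(i));
        }
        return folds;
    }

    public static void main(String[] args) {
        int numInstances = 25;
        List<List<Integer>> folds = split(numInstances, 10, 1L);
        for (int f = 0; f < folds.size(); f++) {
            // Fold f is held out for testing; all other folds form the training set.
            int test = folds.get(f).size();
            System.out.println("fold " + f + ": test=" + test + ", train=" + (numInstances - test));
        }
    }
}
```

Fixing the seed makes the split reproducible, so repeated runs evaluate the classifier on the same partitions.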
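369.5.3. Predicting a Class
In this example, we use a Processor to access functionality that is not directly available from endpoint URIs.
If you arrived here directly and this syntax looks unfamiliar, you may want to read the section on Nessus API Concepts first.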
    try (CamelContext camelctx = new DefaultCamelContext()) {

        camelctx.addRoutes(new RouteBuilder() {

            @Override
            public void configure() throws Exception {

                // Use the file component to read the test data
                from("file:src/test/resources/data?fileName=sfny-test.arff")

                    // Remove the class attribute
                    .to("weka:filter?apply=Remove -R last")

                    // Add the 'prediction' placeholder attribute
                    .to("weka:filter?apply=Add -N predicted -T NOM -L 0,1")

                    // Rename the relation
                    .to("weka:filter?apply=RenameRelation -modify sfny-predicted")

                    // Load an already existing model
                    .to("weka:model?loadFrom=src/test/resources/data/sfny-j48.model")

                    // Use a processor to do the prediction
                    .process(new Processor() {
                        @Override
                        public void process(Exchange exchange) throws Exception {
                            Dataset dataset = exchange.getMessage().getBody(Dataset.class);
                            dataset.applyToInstances(new NominalPredictor());
                        }
                    })

                    // Write the data file
                    .to("weka:write?path=src/test/resources/data/sfny-predicted.arff");
            }
        });
        camelctx.start();
    }
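The route above adds an empty 'predicted' attribute and then lets a processor fill it in for every instance. The toy sketch below illustrates that pattern in plain Java, without Camel, Weka, or the Nessus API. All names in it (`PredictSketch`, `Row`, `applyToRows`) are hypothetical, and the threshold rule is only a stand-in for a trained classifier such as the J48 model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class PredictSketch {

    // One data row: numeric feature values plus a nominal 'predicted' slot.
    static final class Row {
        final double[] features;
        String predicted; // stays null until a prediction is written

        Row(double... features) {
            this.features = features;
        }
    }

    // Apply the model to every row and store its answer in the placeholder.
    static void applyToRows(List<Row> rows, Function<double[], String> model) {
        for (Row row : rows) {
            row.predicted = model.apply(row.features);
        }
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(10.0, 250.0));
        rows.add(new Row(80.0, 120.0));

        // Hypothetical stand-in for a trained classifier: a single threshold rule.
        Function<double[], String> model = f -> f[0] > 30.0 ? "1" : "0";

        applyToRows(rows, model);
        for (Row row : rows) {
            System.out.println(row.predicted);
        }
    }
}
```

The labels "0" and "1" mirror the nominal values declared with `-L 0,1` in the Add filter.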