Chapter 12. PMML support in Red Hat Decision Manager
Red Hat Decision Manager includes consumer conformance support for the following PMML model types:
- Regression models
- Scorecard models
- Tree models
- Mining models (with sub-types modelChain, selectAll, and selectFirst)
- Clustering models
For a list of all PMML model types, including those not supported in Red Hat Decision Manager, see the DMG PMML specification.
Red Hat Decision Manager offers two PMML implementations: PMML legacy and PMML trusty.
The PMML legacy implementation is deprecated as of Red Hat Decision Manager 7.10.0 and will be replaced by the PMML trusty implementation in a future Red Hat Decision Manager release.
Red Hat Decision Manager does not include a built-in PMML model editor, but you can use an XML or PMML-specific authoring tool to create PMML models and then integrate the PMML models in your decision services in Red Hat Decision Manager. You can import PMML files into your project in Business Central (Menu
For more information about including assets such as PMML files with your project packaging and deployment method, see Packaging and deploying a Red Hat Decision Manager project.
You can migrate a PMML service to a Red Hat build of Kogito microservice. For more information about migrating to Red Hat build of Kogito microservices, see Migrating to Red Hat build of Kogito microservices.
12.1. PMML trusty support and naming conventions in Red Hat Decision Manager
When you add a PMML file to a project in Red Hat Decision Manager, multiple assets are generated. The tree and scorecard models are translated to rules, and regression and mining models are translated to Java classes. Each type of PMML model generates a different set of assets, but all PMML model types generate at least the following set of assets:
- A root package whose name is derived from the PMML file name
- In the root package, a Java factory class that is used to instantiate the model
- A subpackage specific to the model whose name is derived from the model name
- For rule models, two rule-mapper classes that are used to instantiate the rule network
- For mining models, children model packages and classes are nested in the parent model
Currently, only one model for each PMML file is allowed. Also, extensions are temporarily not supported.
The following are naming conventions for generated PMML packages and classes:
- The root package name is the name of the original PMML file in lowercase and without spaces, for example, sampleregression.
- The name of the generated factory Java class is the PMML file name with Factory added to it in the format fileName+"Factory" and with the first letter in uppercase, for example, SampleRegressionFactory.
- The subpackage name of a model is the name of the original model in lowercase and without spaces, for example, compoundnestedpredicatescorecard.
- The names of the generated data classes are determined by the model type:
  - Rules models: A top-level PMMLRuleMappersImpl is generated including references to PMMLRuleMapperImpl classes that are nested in the subpackages.
  - Mining models:
    - The name of the created segmentation subpackage is the name of the original model in lowercase, without spaces, and with segmentation added to it in the format modelName+"segmentation", for example, mixedminingsegmentation.
    - In the segmentation subpackage, a segmentation Java class is created that contains the references to the nested models. The name of the created segmentation Java class is the model name with Segmentation added to it in the format modelName+"Segmentation", for example, MixedMiningSegmentation.
    - For each segment, a specific subpackage is created. The name of the segment-specific subpackage is the original model name in lowercase with segment and a progressive integer starting from 0 added to it in the format modelName+"segment"+integer, for example, mixedminingsegment0, mixedminingsegment1.
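The naming conventions above can be sketched as a small Java helper. This is an illustration only: the class PmmlTrustyNames and its methods are hypothetical and are not part of Red Hat Decision Manager, and the sketch assumes that the file name is passed without its extension and that only whitespace needs to be removed from names.

```java
import java.util.Locale;

/**
 * Hypothetical helper that mirrors the documented PMML trusty naming
 * conventions. It is not the generator used by Red Hat Decision Manager.
 */
final class PmmlTrustyNames {

    private PmmlTrustyNames() {
    }

    /** Root package or model subpackage: lowercase with whitespace removed. */
    static String packageName(String name) {
        return name.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
    }

    /** Factory class: fileName + "Factory" with the first letter in uppercase. */
    static String factoryClassName(String fileName) {
        return capitalize(fileName.replaceAll("\\s+", "")) + "Factory";
    }

    /** Segmentation subpackage of a mining model: modelName + "segmentation". */
    static String segmentationPackageName(String modelName) {
        return packageName(modelName) + "segmentation";
    }

    /** Segmentation class of a mining model: modelName + "Segmentation". */
    static String segmentationClassName(String modelName) {
        return capitalize(modelName.replaceAll("\\s+", "")) + "Segmentation";
    }

    /** Subpackage of one segment: modelName + "segment" + index, starting at 0. */
    static String segmentPackageName(String modelName, int index) {
        return packageName(modelName) + "segment" + index;
    }

    private static String capitalize(String s) {
        return s.isEmpty() ? s : Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(packageName("SampleRegression"));          // sampleregression
        System.out.println(factoryClassName("SampleRegression"));     // SampleRegressionFactory
        System.out.println(segmentationClassName("MixedMining"));     // MixedMiningSegmentation
        System.out.println(segmentPackageName("MixedMining", 1));     // mixedminingsegment1
    }
}
```

A helper like this can be useful when you need to predict which generated packages to look for in a built project, but the generated sources remain the authoritative reference for the actual names.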
Known limitations of PMML trusty implementation
The following list shows elements that are not implemented for PMML trusty:
- Target element is not implemented
- Extension element is not implemented
- MiningSchema or MiningField elements that are not implemented include:
  - importance
  - outliers
  - lowValue
  - highValue
  - invalidValueTreatment
  - invalidValueReplacement
- OutputField elements that are not implemented include:
  - Decisions
  - Value
  - Rule feature
  - Algorithm
  - isMultiValued
  - segmentId
  - isFinalResult
- TransformationDictionary or LocalTransformation expressions that are not supported include:
  - NormContinuous
  - NormDiscrete
  - MapValues
  - TextIndex
  - Aggregate
  - Lag
- ModelStats and ModelExplanation elements are not implemented in any model type, including regression, tree, scorecard, and mining
- verification element is not implemented in tree, scorecard, and mining models
- VariableWeight element is not implemented in mining models
- Tree model elements that are not implemented include:
  - IsMissing or IsNotMissing
  - Surrogate in CompoundPredicate
  - missingValuePenalty
  - splitCharacteristic
  - isScorable
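Before importing a model, it can help to check whether the PMML file uses any of the elements listed above. The following is a minimal sketch of such a check using the standard Java XML APIs. The class PmmlTrustyLimitationCheck is hypothetical and is not part of Red Hat Decision Manager. The check covers only a subset of the listed element names, ignores attributes such as importance or outliers, and reports an element wherever it appears in the document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical pre-import check; not a Red Hat Decision Manager API. */
final class PmmlTrustyLimitationCheck {

    // Subset of the element names that the list above marks as
    // not implemented or not supported in PMML trusty.
    static final List<String> UNIMPLEMENTED_ELEMENTS = Arrays.asList(
            "Target", "Extension", "NormContinuous", "NormDiscrete", "MapValues",
            "TextIndex", "Aggregate", "Lag", "ModelStats", "ModelExplanation");

    private PmmlTrustyLimitationCheck() {
    }

    /** Returns the listed element names that occur anywhere in the PMML text. */
    static List<String> findUnimplementedElements(String pmmlXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Match on local names so that prefixed and default-namespace documents both work.
            factory.setNamespaceAware(true);
            // Reject DOCTYPE declarations to avoid external entity processing.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(pmmlXml)));

            List<String> found = new ArrayList<>();
            for (String name : UNIMPLEMENTED_ELEMENTS) {
                if (doc.getElementsByTagNameNS("*", name).getLength() > 0) {
                    found.add(name);
                }
            }
            return found;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse PMML document", e);
        }
    }
}
```

An empty result does not guarantee that a model is fully supported, because the sketch does not inspect attributes or the model-specific limitations.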
12.2. PMML legacy support and naming conventions in Red Hat Decision Manager
When you add a PMML file to a project in Red Hat Decision Manager, multiple assets are generated. Each type of PMML model generates a different set of assets, but all PMML model types generate at least the following set of assets:
- A DRL file that contains all of the rules associated with your PMML model
- At least two Java classes:
  - A data class that is used as the default object type for the model type
  - A RuleUnit class that is used to manage data sources and rule execution
If a PMML file has MiningModel as the root model, multiple instances of each of these files are generated.
The following are naming conventions for generated PMML legacy packages, classes, and rules:
- If no package name is given in a PMML model file, then the default package name org.kie.pmml.pmml_4_2 is prefixed to the model name for the generated rules in the format "org.kie.pmml.pmml_4_2"+modelName.
- The package name for the generated RuleUnit Java class is the same as the package name for the generated rules.
- The name of the generated RuleUnit Java class is the model name with RuleUnit added to it in the format modelName+"RuleUnit".
- Each PMML model has at least one data class that is generated. The package name for these classes is org.kie.pmml.pmml_4_2.model.
- The names of generated data classes are determined by the model type, prefixed with the model name:
  - Regression models: One data class named modelName+"RegressionData"
  - Scorecard models: One data class named modelName+"ScoreCardData"
  - Tree models: Two data classes, the first named modelName+"TreeNode" and the second named modelName+"TreeToken"
  - Mining models: One data class named modelName+"MiningModelData"
The mining model also generates all of the rules and classes that are within each of its segments.
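The class naming conventions for the PMML legacy implementation can be sketched in Java as follows. The class PmmlLegacyNames is hypothetical and is not part of Red Hat Decision Manager. The sketch assumes that the model name is already a valid Java identifier and is used unchanged, and it covers only the RuleUnit and data class names, not the package of the generated rules.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper that mirrors the documented PMML legacy class
 * naming conventions. It is not the generator used by the product.
 */
final class PmmlLegacyNames {

    /** Package of the generated data classes, as stated in the conventions above. */
    static final String DATA_CLASS_PACKAGE = "org.kie.pmml.pmml_4_2.model";

    /** Model types and the suffixes of their generated data classes. */
    enum ModelType {
        REGRESSION("RegressionData"),
        SCORECARD("ScoreCardData"),
        TREE("TreeNode", "TreeToken"),
        MINING("MiningModelData");

        private final String[] suffixes;

        ModelType(String... suffixes) {
            this.suffixes = suffixes;
        }
    }

    private PmmlLegacyNames() {
    }

    /** RuleUnit class: modelName + "RuleUnit". */
    static String ruleUnitClassName(String modelName) {
        return modelName + "RuleUnit";
    }

    /** Fully qualified data class names: package + modelName + type suffix. */
    static List<String> dataClassNames(ModelType type, String modelName) {
        List<String> names = new ArrayList<>();
        for (String suffix : type.suffixes) {
            names.add(DATA_CLASS_PACKAGE + "." + modelName + suffix);
        }
        return names;
    }
}
```

For example, under these assumptions a tree model named Golf would map to the data classes org.kie.pmml.pmml_4_2.model.GolfTreeNode and org.kie.pmml.pmml_4_2.model.GolfTreeToken, and to a RuleUnit class named GolfRuleUnit.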
12.2.1. PMML extensions in Red Hat Decision Manager
The PMML legacy specification supports Extension elements that extend the content of a PMML model. You can use extensions at almost every level of a PMML model definition, and as the first and last child in the main element of a model for maximum flexibility. For more information about PMML extensions, see the DMG PMML Extension Mechanism.
To optimize PMML integration, Red Hat Decision Manager supports the following additional PMML extensions:
- modelPackage: Designates a package name for the generated rules and Java classes. Include this extension in the Header section of the PMML model file.
- adapter: Designates the type of construct (bean or trait) that is used to contain input and output data for rules. Insert this extension in the MiningSchema or Output section (or both) of the PMML model file.
- externalClass: Used in conjunction with the adapter extension in defining a MiningField or OutputField. This extension contains a class with an attribute name that matches the name of the MiningField or OutputField element.