4.5. Analysis
In the Query Module, the process of converting text into single terms is called Analysis and is a key feature of the full-text search engine. Lucene uses Analyzers to control this process.
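As a rough illustration of what "converting text into single terms" means, the following sketch splits a string on non-alphanumeric characters and lowercases each piece. It is a simplified stand-in written against the JDK only, not Lucene's StandardAnalyzer, and the class and method names (AnalysisSketch, analyze) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for an analyzer; this is NOT Lucene's StandardAnalyzer.
class AnalysisSketch {

    // Splits text on runs of non-letter, non-digit characters and lowercases each piece.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            // A leading delimiter produces an empty first piece; skip it.
            if (!piece.isEmpty()) {
                terms.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // Prints [full, text, search, explained]
        System.out.println(analyze("Full-Text Search, explained!"));
    }
}
```

A real analyzer typically does more than this (for example stop-word handling or stemming, depending on the implementation), but the input and output have the same shape: text in, a list of terms out.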
4.5.1. Default Analyzer and Analyzer by Class
The default analyzer class is used to index tokenized fields and is configurable through the default.analyzer property. The default value for this property is org.apache.lucene.analysis.standard.StandardAnalyzer.
The analyzer class can be defined per entity, per property, and per @Field, which is useful when multiple fields are indexed from a single property.
In the following example, EntityAnalyzer is used to index all tokenized properties, such as name, except summary and body, which are indexed with PropertyAnalyzer and FieldAnalyzer respectively.
Example 4.9. Different ways of using @Analyzer
@Indexed
@Analyzer(impl = EntityAnalyzer.class)
public class MyEntity {

    // Uses the entity-level EntityAnalyzer
    @Field
    private String name;

    // Uses the property-level PropertyAnalyzer
    @Field
    @Analyzer(impl = PropertyAnalyzer.class)
    private String summary;

    // Uses the field-level FieldAnalyzer
    @Field(analyzer = @Analyzer(impl = FieldAnalyzer.class))
    private String body;

    ...
}
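The selection rule implied by the example can be sketched as a "most specific declaration wins" lookup. This is an illustrative model only, not the Query Module's actual resolution code: the class AnalyzerResolutionSketch and its resolve method are hypothetical, analyzers are represented by plain names, and the ordering of field level over property level is an assumption, since the example never declares both on the same property.

```java
// Hypothetical model of analyzer selection; not the Query Module's actual implementation.
class AnalyzerResolutionSketch {

    // Default named in the text for the default.analyzer property.
    static final String DEFAULT_ANALYZER =
            "org.apache.lucene.analysis.standard.StandardAnalyzer";

    // Returns the most specific analyzer declared; null means "not declared at this level".
    // Assumed order: @Field level, then property level, then entity level, then the default.
    static String resolve(String fieldLevel, String propertyLevel, String entityLevel) {
        if (fieldLevel != null) {
            return fieldLevel;
        }
        if (propertyLevel != null) {
            return propertyLevel;
        }
        if (entityLevel != null) {
            return entityLevel;
        }
        return DEFAULT_ANALYZER;
    }

    public static void main(String[] args) {
        // Mirrors MyEntity: each property declares an analyzer at a different level.
        System.out.println("name    -> " + resolve(null, null, "EntityAnalyzer"));
        System.out.println("summary -> " + resolve(null, "PropertyAnalyzer", "EntityAnalyzer"));
        System.out.println("body    -> " + resolve("FieldAnalyzer", null, "EntityAnalyzer"));
    }
}
```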
Note
Avoid using different analyzers on a single entity. Doing so can create complications in building queries and make results less predictable, particularly if using a QueryParser. Use the same analyzer for indexing and querying on any field.
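The reason for using the same analyzer at index time and query time can be shown with a small model. In this sketch, which uses only the JDK and no Lucene classes, an analyzer is reduced to a function from text to a single term; the names AnalyzerMismatchSketch, LOWERCASING, AS_IS, and matches are all hypothetical.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;

// Hypothetical model: an "analyzer" is just a function from text to one term.
class AnalyzerMismatchSketch {

    static final Function<String, String> LOWERCASING =
            s -> s.trim().toLowerCase(Locale.ROOT);
    static final Function<String, String> AS_IS = String::trim;

    // Indexes one value with indexAnalyzer, then looks it up using queryAnalyzer.
    static boolean matches(String indexedText, Function<String, String> indexAnalyzer,
                           String queryText, Function<String, String> queryAnalyzer) {
        Set<String> index = new HashSet<>();
        index.add(indexAnalyzer.apply(indexedText));
        return index.contains(queryAnalyzer.apply(queryText));
    }

    public static void main(String[] args) {
        // Same analyzer on both sides: the stored term and the query term agree.
        System.out.println(matches("Infinispan", LOWERCASING, "Infinispan", LOWERCASING));
        // Different analyzers: the index holds "infinispan" but the query asks for "Infinispan".
        System.out.println(matches("Infinispan", LOWERCASING, "Infinispan", AS_IS));
    }
}
```

The first call finds the value and the second does not, even though the user typed exactly the text that was indexed.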