4.5.2. Named Analyzers
The Query Module uses analyzer definitions to deal with the complexity of the Analyzer function. Analyzer definitions are reusable by multiple @Analyzer declarations and include the following:
- a name: the unique string used to refer to the definition.
- a list of CharFilters: each CharFilter is responsible for pre-processing input characters before tokenization. CharFilters can add, change, or remove characters; one common use is character normalization.
- a Tokenizer: responsible for tokenizing the input stream into individual words.
- a list of filters: each filter is responsible for removing, modifying, or sometimes adding words to the stream provided by the Tokenizer.
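As an illustration, such a definition can be declared with the @AnalyzerDef annotation from Hibernate Search, which underlies the Query Module. This is a hedged sketch, not taken from the original text: the entity name Article, the analyzer name customanalyzer, and the mapping-chars.properties file are hypothetical, and the factory classes shown are the standard ones shipped with the Solr analyzer framework.

```java
// Hypothetical entity demonstrating a named analyzer definition.
// Assumes Hibernate Search annotations and Solr analyzer factories
// are on the classpath.
@AnalyzerDef(name = "customanalyzer",
    // CharFilter: normalize characters before tokenization,
    // using a (hypothetical) character mapping file.
    charFilters = {
        @CharFilterDef(factory = MappingCharFilterFactory.class,
            params = { @Parameter(name = "mapping",
                                  value = "mapping-chars.properties") })
    },
    // Tokenizer: split the character stream into individual words.
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
    // TokenFilters: lowercase each token, then stem it.
    filters = {
        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
        @TokenFilterDef(factory = SnowballPorterFilterFactory.class,
            params = { @Parameter(name = "language", value = "English") })
    })
public class Article {
    @Field
    @Analyzer(definition = "customanalyzer")
    private String content;
}
```

Because the definition is registered under its name, any other indexed field or entity can reuse it with @Analyzer(definition = "customanalyzer") instead of repeating the configuration.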
The Analyzer separates these components into multiple tasks, allowing individual components to be reused and built flexibly, using the following procedure:
Procedure 4.1. The Analyzer Process
- The CharFilters process the character input.
- The Tokenizer converts the character input into tokens.
- The tokens are then processed by the TokenFilters.
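The three stages above can be sketched in plain Java. This is a simplified illustration of the data flow only, not the Lucene API: the class and method names are hypothetical, and the specific normalization, split rule, and stop word are arbitrary examples.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, simplified model of the analyzer process:
// CharFilters -> Tokenizer -> TokenFilters.
public class AnalyzerPipeline {

    // Stage 1 (CharFilter): pre-process characters before tokenization,
    // e.g. a normalization mapping.
    static String charFilter(String input) {
        return input.replace("é", "e");
    }

    // Stage 2 (Tokenizer): convert the character input into tokens.
    static List<String> tokenize(String input) {
        return Arrays.asList(input.split("\\s+"));
    }

    // Stage 3 (TokenFilters): modify or remove tokens, e.g. lowercase
    // each token and drop a stop word.
    static List<String> tokenFilters(List<String> tokens) {
        return tokens.stream()
                .map(String::toLowerCase)
                .filter(t -> !t.equals("the"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String filtered = charFilter("The Café near the Park");
        List<String> tokens = tokenFilters(tokenize(filtered));
        System.out.println(tokens); // [cafe, near, park]
    }
}
```

Splitting the work this way is what makes each component independently reusable: the same tokenizer can be combined with different char filters and token filters in different named definitions.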
The Lucene-based Query API supports this infrastructure by utilizing the Solr analyzer framework.