此内容没有您所选择的语言版本。

5.3. Filters

Apache Lucene is able to filter query results according to a custom filtering process. This is a powerful way to apply additional data restrictions, especially since filters can be cached and reused. Applicable use cases include:

security
temporal data (example, view only last month's data)
population filter (example, search limited to a given category)
and many more

Report a bug

5.3.1. Defining and Implementing a Filter
复制链接

The Lucene-based Query API includes transparent caches named filters which include parameters. The API is similar to the Hibernate Core filters:

Example 5.21. Enabling Fulltext Filters for a Query

cacheQuery = Search.getSearchManager(cache).getQuery(query, Driver.class);
cacheQuery.enableFullTextFilter("bestDriver");
cacheQuery.enableFullTextFilter("security").setParameter("login", "andre");
cacheQuery.list(); //returns only best drivers where andre has credentials

cacheQuery = Search.getSearchManager(cache).getQuery(query, Driver.class);
cacheQuery.enableFullTextFilter("bestDriver");
cacheQuery.enableFullTextFilter("security").setParameter("login", "andre");
cacheQuery.list(); //returns only best drivers where andre has credentials

Copy to Clipboard

Toggle word wrap

In the provided example, two filters are enabled in the query. Enable or disable filters to customize the query.

Declare filters using the @FullTextFilterDef annotation. This annotation applies to @Indexed entities irrespective of the filter's query. Filter definitions are global therefore each filter must have a unique name. If two @FullTextFilterDef annotations with the same name are defined, a SearchException is thrown. Each named filter must specify its filter implementation.

Example 5.22. Defining and Implementing a Filter

@FullTextFilterDefs({
    @FullTextFilterDef(name = "bestDriver", impl = BestDriversFilter.class), 
    @FullTextFilterDef(name = "security", impl = SecurityFilterFactory.class) 
})
public class Driver { ... }

@FullTextFilterDefs({
    @FullTextFilterDef(name = "bestDriver", impl = BestDriversFilter.class), 
    @FullTextFilterDef(name = "security", impl = SecurityFilterFactory.class) 
})
public class Driver { ... }

Copy to Clipboard

Toggle word wrap

public class BestDriversFilter extends org.apache.lucene.search.Filter {

    public DocIdSet getDocIdSet(IndexReader reader) throws IOException {
        OpenBitSet bitSet = new OpenBitSet(reader.maxDoc());
        TermDocs termDocs = reader.termDocs(new Term("score", "5"));
        while (termDocs.next()) {
            bitSet.set(termDocs.doc());
        }
        return bitSet;
    }
}

public class BestDriversFilter extends org.apache.lucene.search.Filter {

    public DocIdSet getDocIdSet(IndexReader reader) throws IOException {
        OpenBitSet bitSet = new OpenBitSet(reader.maxDoc());
        TermDocs termDocs = reader.termDocs(new Term("score", "5"));
        while (termDocs.next()) {
            bitSet.set(termDocs.doc());
        }
        return bitSet;
    }
}

Copy to Clipboard

Toggle word wrap

BestDriversFilter is a Lucene filter that reduces the result set to drivers where the score is 5. In the example, the filter implements the org.apache.lucene.search.Filter directly and contains a no-arg constructor.

Report a bug

5.3.2. The @Factory Filter
复制链接

Use the following factory pattern if the filter creation requires further steps, or if the filter does not have a no-arg constructor:

Example 5.23. Creating a filter using the factory pattern

@FullTextFilterDef(name = "bestDriver", impl = BestDriversFilterFactory.class)
public class Driver { ... }

public class BestDriversFilterFactory {

    @Factory
    public Filter getFilter() {
        //some additional steps to cache the filter results per IndexReader
        Filter bestDriversFilter = new BestDriversFilter();
        return new CachingWrapperFilter(bestDriversFilter);
    }
}

@FullTextFilterDef(name = "bestDriver", impl = BestDriversFilterFactory.class)
public class Driver { ... }

public class BestDriversFilterFactory {

    @Factory
    public Filter getFilter() {
        //some additional steps to cache the filter results per IndexReader
        Filter bestDriversFilter = new BestDriversFilter();
        return new CachingWrapperFilter(bestDriversFilter);
    }
}

Copy to Clipboard

Toggle word wrap

The Lucene-based Query API uses a @Factory annotated method to build the filter instance. The factory must have a no argument constructor.

Named filters come in handy where parameters have to be passed to the filter. For example a security filter might want to know which security level you want to apply:

Example 5.24. Passing parameters to a defined filter

cacheQuery = Search.getSearchManager(cache).getQuery(query, Driver.class);
cacheQuery.enableFullTextFilter("security").setParameter("level", 5);

cacheQuery = Search.getSearchManager(cache).getQuery(query, Driver.class);
cacheQuery.enableFullTextFilter("security").setParameter("level", 5);

Copy to Clipboard

Toggle word wrap

Each parameter name should have an associated setter on either the filter or filter factory of the targeted named filter definition.

Example 5.25. Using parameters in the actual filter implementation

public class SecurityFilterFactory {
    private Integer level;

    /**
     * injected parameter
     */
    public void setLevel(Integer level) {
        this.level = level;
    }

    @Key 
    public FilterKey getKey() {
        StandardFilterKey key = new StandardFilterKey();
        key.addParameter(level);
        return key;
    }

    @Factory
    public Filter getFilter() {
        Query query = new TermQuery(new Term("level", level.toString()));
        return new CachingWrapperFilter(new QueryWrapperFilter(query));
    }
}

public class SecurityFilterFactory {
    private Integer level;

    /**
     * injected parameter
     */
    public void setLevel(Integer level) {
        this.level = level;
    }

    @Key 
    public FilterKey getKey() {
        StandardFilterKey key = new StandardFilterKey();
        key.addParameter(level);
        return key;
    }

    @Factory
    public Filter getFilter() {
        Query query = new TermQuery(new Term("level", level.toString()));
        return new CachingWrapperFilter(new QueryWrapperFilter(query));
    }
}

Copy to Clipboard

Toggle word wrap

Note the method annotated @Key returns a FilterKey object. The returned object has a special contract: the key object must implement equals() / hashCode() so that two keys are equal if and only if the given Filter types are the same and the set of parameters are the same. In other words, two filter keys are equal if and only if the filters from which the keys are generated can be interchanged. The key object is used as a key in the cache mechanism.

Report a bug

5.3.3. Key Objects
复制链接

@Key methods are needed only if:

the filter caching system is enabled (enabled by default)
the filter has parameters

The StandardFilterKey delegates the equals() / hashCode() implementation to each of the parameters equals and hashcode methods.

The defined filters are per default cached. The cache uses a combination of hard and soft references to allow disposal of memory when needed. The hard reference cache keeps track of the most recently used filters and transforms the ones least used to SoftReferences when needed. Once the limit of the hard reference cache is reached additional filters are cached as SoftReferences. To adjust the size of the hard reference cache, use default.filter.cache_strategy.size (defaults to 128). For advanced use of filter caching, you can implement your own FilterCachingStrategy. The classname is defined by default.filter.cache_strategy.

This filter caching mechanism should not be confused with caching the actual filter results. In Lucene it is common practice to wrap filters using the IndexReader around a CachingWrapperFilter. The wrapper will cache the DocIdSet returned from the getDocIdSet(IndexReader reader) method to avoid expensive recomputation. It is important to mention that the computed DocIdSet is only cachable for the same IndexReader instance, because the reader effectively represents the state of the index at the moment it was opened. The document list cannot change within an opened IndexReader. A different/newIndexReader instance, however, works potentially on a different set of Documents (either from a different index or simply because the index has changed), hence the cached DocIdSet has to be recomputed.

Report a bug

5.3.4. Full Text Filter
复制链接

The Lucene-based Query API uses the cache flag of @FullTextFilterDef, set to FilterCacheModeType.INSTANCE_AND_DOCIDSETRESULTS which automatically caches the filter instance and wraps the filter around a Hibernate specific implementation of CachingWrapperFilter. Unlike Lucene's version of this class, SoftReferences are used with a hard reference count (see discussion about filter cache). The hard reference count is adjusted using default.filter.cache_docidresults.size (defaults to 5). Wrapping is controlled using the @FullTextFilterDef.cache parameter. There are three different values for this parameter:

Expand

Value	Definition
FilterCacheModeType.NONE	No filter instance and no result is cached by the Query Module. For every filter call, a new filter instance is created. This setting addresses rapidly changing data sets or heavily memory constrained environments.
FilterCacheModeType.INSTANCE_ONLY	The filter instance is cached and reused across concurrent `Filter.getDocIdSet()` calls. `DocIdSet` results are not cached. This setting is useful when a filter uses its own specific caching mechanism or the filter results change dynamically due to application specific events making `DocIdSet` caching in both cases unnecessary.
FilterCacheModeType.INSTANCE_AND_DOCIDSETRESULTS	Both the filter instance and the `DocIdSet` results are cached. This is the default value.

Filters should be cached in the following situations:

The system does not update the targeted entity index often (in other words, the IndexReader is reused a lot).
The Filter's DocIdSet is expensive to compute (compared to the time spent to execute the query).

Report a bug

5.3.5. Using Filters in a Sharded Environment
复制链接

Execute queries on a subset of the available shards in a sharded environment as follows:

Create a sharding strategy to select a subset of IndexManagers depending on filter configurations.
Activate the filter when running the query.

The following is an example of sharding strategy that queries a specific shard if the customer filter is activated:

Example 5.26. Querying a Specific Shard

public class CustomerShardingStrategy implements IndexShardingStrategy {

    // stored IndexManagers in a array indexed by customerID
    private IndexManager[] indexManagers;
 
    public void initialize(Properties properties, IndexManager[] indexManagers) {
        this.indexManagers = indexManagers;
    }

    public IndexManager[] getIndexManagersForAllShards() {
        return indexManagers;
    }

    public IndexManager getIndexManagerForAddition(
        Class<?> entity, Serializable id, String idInString, Document document) {
        Integer customerID = Integer.parseInt(document.getFieldable("customerID")
                                                      .stringValue());
        return indexManagers[customerID];
    }
    
    public IndexManager[] getIndexManagersForDeletion(
        Class<> entity, Serializable id, String idInString) {
        return getIndexManagersForAllShards();
    }

    /**
     * Optimization; don't search ALL shards and union the results; in this case, we 
     * can be certain that all the data for a particular customer Filter is in a single
     * shard; return that shard by customerID.
     */
    public IndexManager[] getIndexManagersForQuery(
        FullTextFilterImplementor[] filters) {
        FullTextFilter filter = getCustomerFilter(filters, "customer");
        if (filter == null) {
            return getIndexManagersForAllShards();
        }
        else {
            return new IndexManager[] { indexManagers[Integer.parseInt(
                filter.getParameter("customerID").toString())] };
        }
    }

    private FullTextFilter getCustomerFilter(FullTextFilterImplementor[] filters, 
                                             String name) {
        for (FullTextFilterImplementor filter: filters) {
            if (filter.getName().equals(name)) return filter;
        }
        return null;
    }
}

public class CustomerShardingStrategy implements IndexShardingStrategy {

    // stored IndexManagers in a array indexed by customerID
    private IndexManager[] indexManagers;
 
    public void initialize(Properties properties, IndexManager[] indexManagers) {
        this.indexManagers = indexManagers;
    }

    public IndexManager[] getIndexManagersForAllShards() {
        return indexManagers;
    }

    public IndexManager getIndexManagerForAddition(
        Class<?> entity, Serializable id, String idInString, Document document) {
        Integer customerID = Integer.parseInt(document.getFieldable("customerID")
                                                      .stringValue());
        return indexManagers[customerID];
    }
    
    public IndexManager[] getIndexManagersForDeletion(
        Class<> entity, Serializable id, String idInString) {
        return getIndexManagersForAllShards();
    }

    /**
     * Optimization; don't search ALL shards and union the results; in this case, we 
     * can be certain that all the data for a particular customer Filter is in a single
     * shard; return that shard by customerID.
     */
    public IndexManager[] getIndexManagersForQuery(
        FullTextFilterImplementor[] filters) {
        FullTextFilter filter = getCustomerFilter(filters, "customer");
        if (filter == null) {
            return getIndexManagersForAllShards();
        }
        else {
            return new IndexManager[] { indexManagers[Integer.parseInt(
                filter.getParameter("customerID").toString())] };
        }
    }

    private FullTextFilter getCustomerFilter(FullTextFilterImplementor[] filters, 
                                             String name) {
        for (FullTextFilterImplementor filter: filters) {
            if (filter.getName().equals(name)) return filter;
        }
        return null;
    }
}

Copy to Clipboard

Toggle word wrap

If the customer filter is present in the example, the query only uses the shard dedicated to the customer. The query returns all shards if the customer filter is not found. The sharding strategy reacts to each filter depending on the provided parameters.

Activate the filter when the query must be run. The filter is a regular filter (as defined in Section 5.3, “Filters”), which filters Lucene results after the query. As an alternate, use a special filter that is passed to the sharding strategy and then ignored for duration of the query. Use the ShardSensitiveOnlyFilter class to declare the filter.

Example 5.27. Using the ShardSensitiveOnlyFilter Class

@Indexed
@FullTextFilterDef(name = "customer", impl = ShardSensitiveOnlyFilter.class)
public class Customer {
   ...
}

CacheQuery cacheQuery = Search.getSearchManager(cache).getQuery(query,
    Customer.class);
cacheQuery.enableFulltextFilter("customer").setParameter("CustomerID", 5);
@SuppressWarnings("unchecked")
List results = query.List();

@Indexed
@FullTextFilterDef(name = "customer", impl = ShardSensitiveOnlyFilter.class)
public class Customer {
   ...
}

CacheQuery cacheQuery = Search.getSearchManager(cache).getQuery(query,
    Customer.class);
cacheQuery.enableFulltextFilter("customer").setParameter("CustomerID", 5);
@SuppressWarnings("unchecked")
List results = query.List();

Copy to Clipboard

Toggle word wrap

If the ShardSensitiveOnlyFilter filter is used, Lucene filters do not need to be implemented. Use filters and sharding strategies reacting to these filters for faster query execution in a sharded environment.

Report a bug

此内容没有您所选择的语言版本。

5.3. Filters

5.3.1. Defining and Implementing a Filter
复制链接

5.3.2. The @Factory Filter
复制链接

5.3.3. Key Objects
复制链接

5.3.4. Full Text Filter
复制链接

5.3.5. Using Filters in a Sharded Environment
复制链接

学习

尝试、购买和销售

社区

关于红帽文档

让开源更具包容性

關於紅帽

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links

此内容没有您所选择的语言版本。

5.3. Filters

5.3.1. Defining and Implementing a Filter复制链接链接已复制到粘贴板!

5.3.2. The @Factory Filter复制链接链接已复制到粘贴板!

5.3.3. Key Objects复制链接链接已复制到粘贴板!

5.3.4. Full Text Filter复制链接链接已复制到粘贴板!

5.3.5. Using Filters in a Sharded Environment复制链接链接已复制到粘贴板!

学习

尝试、购买和销售

社区

关于红帽文档

让开源更具包容性

關於紅帽

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links

5.3.1. Defining and Implementing a Filter
复制链接

5.3.2. The @Factory Filter
复制链接

5.3.3. Key Objects
复制链接

5.3.4. Full Text Filter
复制链接

5.3.5. Using Filters in a Sharded Environment
复制链接