Chapter 5. Querying
Procedure 5.1. Prepare and Execute a Query
- Get
SearchManager
of an indexing enabled cache as follows:SearchManager manager = Search.getSearchManager(cache);
- Create a
QueryBuilder
to build queries forMyth.class
as follows:final org.hibernate.search.query.dsl.QueryBuilder queryBuilder = manager.buildQueryBuilderForClass(Myth.class).get();
- Create an Apache Lucene query that queries the
Myth.class
class' atributes as follows:org.apache.lucene.search.Query query = queryBuilder.keyword() .onField("history").boostedTo(3) .matching("storm") .createQuery(); // wrap Lucene query in a org.infinispan.query.CacheQuery CacheQuery cacheQuery = manager.getQuery(query); // Get query result List<Object> result = cacheQuery.list();
5.1. Building Queries
5.1.1. Building a Lucene Query Using the Lucene-based Query API
5.1.2. Building a Lucene Query
QueryBuilder
for this task.
- Method names are in English. As a result, API operations can be read and understood as a series of English phrases and instructions.
- It uses IDE autocompletion which helps possible completions for the current input prefix and allows the user to choose the right option.
- It often uses the chaining method pattern.
- It is easy to use and read the API operations.
QueryBuilder
knows what analyzer to use and what field bridge to apply. Several QueryBuilder
s (one for each type involved in the root of your query) can be created. The QueryBuilder
is derived from the SearchFactory
.
Search.getSearchManager(cache).buildQueryBuilderForClass(Myth.class).get();
SearchFactory searchFactory = Search.getSearchManager(cache).getSearchFactory(); QueryBuilder mythQB = searchFactory.buildQueryBuilder() .forEntity(Myth.class) .overridesForField("history","stem_analyzer_definition") .get();
5.1.2.1. Keyword Queries
Example 5.1. Keyword Search
Query luceneQuery = mythQB.keyword().onField("history").matching("storm").createQuery();
Parameter | Description |
---|---|
keyword() | Use this parameter to find a specific word |
onField() | Use this parameter to specify in which lucene field to search the word |
matching() | use this parameter to specify the match for search string |
createQuery() | creates the Lucene query object |
- The value "storm" is passed through the
history
FieldBridge
. This is useful when numbers or dates are involved. - The field bridge value is then passed to the analyzer used to index the field
history
. This ensures that the query uses the same term transformation than the indexing (lower case, ngram, stemming and so on). If the analyzing process generates several terms for a given word, a boolean query is used with theSHOULD
logic (roughly anOR
logic).
@Indexed public class Myth { @Field(analyze = Analyze.NO) @DateBridge(resolution = Resolution.YEAR) public Date getCreationDate() { return creationDate; } public Date setCreationDate(Date creationDate) { this.creationDate = creationDate; } private Date creationDate; } Date birthdate = ...; Query luceneQuery = mythQb.keyword() .onField("creationDate") .matching(birthdate) .createQuery();
Note
Date
object had to be converted to its string representation (in this case the year)
FieldBridge
has an objectToString
method (and all built-in FieldBridge
implementations do).
Example 5.2. Searching Using Ngram Analyzers
@AnalyzerDef(name = "ngram", tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class), filters = { @TokenFilterDef(factory = StandardFilterFactory.class), @TokenFilterDef(factory = LowerCaseFilterFactory.class), @TokenFilterDef(factory = StopFilterFactory.class), @TokenFilterDef(factory = NGramFilterFactory.class, params = { @Parameter(name = "minGramSize", value = "3"), @Parameter(name = "maxGramSize", value = "3")}) }) public class Myth { @Field(analyzer = @Analyzer(definition = "ngram")) public String getName() { return name; } public String setName(String name) { this.name = name; } private String name; } Date birthdate = ...; Query luceneQuery = mythQb.keyword() .onField("name") .matching("Sisiphus") .createQuery();
y
). All that is transparently done for the user.
Note
ignoreAnalyzer()
or ignoreFieldBridge()
functions can be called.
Example 5.3. Searching for Multiple Words
//search document with storm or lightning in their history Query luceneQuery = mythQB.keyword().onField("history").matching("storm lightning").createQuery();
onFields
method.
Example 5.4. Searching Multiple Fields
Query luceneQuery = mythQB .keyword() .onFields("history","description","name") .matching("storm") .createQuery();
andField()
method.
Example 5.5. Using the andField
Method
Query luceneQuery = mythQB.keyword() .onField("history") .andField("name") .boostedTo(5) .andField("description") .matching("storm") .createQuery();
5.1.2.2. Fuzzy Queries
keyword
query and add the fuzzy flag.
Example 5.6. Fuzzy Query
Query luceneQuery = mythQB.keyword() .fuzzy() .withThreshold(.8f) .withPrefixLength(1) .onField("history") .matching("starm") .createQuery();
threshold
is the limit above which two terms are considering matching. It is a decimal between 0 and 1 and the default value is 0.5. The prefixLength
is the length of the prefix ignored by the "fuzzyness". While the default value is 0, a non zero value is recommended for indexes containing a huge amount of distinct terms.
5.1.2.3. Wildcard Queries
?
represents a single character and *
represents any character sequence. Note that for performance purposes, it is recommended that the query does not start with either ?
or *
.
Example 5.7. Wildcard Query
Query luceneQuery = mythQB.keyword() .wildcard() .onField("history") .matching("sto*") .createQuery();
Note
*
or ?
being mangled is too high.
5.1.2.4. Phrase Queries
phrase()
to do so.
Example 5.8. Phrase Query
Query luceneQuery = mythQB.phrase() .onField("history") .sentence("Thou shalt not kill") .createQuery();
Example 5.9. Adding Slop Factor
Query luceneQuery = mythQB.phrase() .withSlop(3) .onField("history") .sentence("Thou kill") .createQuery();
5.1.2.5. Range Queries
Example 5.10. Range Query
//look for 0 <= starred < 3 Query luceneQuery = mythQB.range() .onField("starred") .from(0).to(3).excludeLimit() .createQuery(); //look for myths strictly BC Date beforeChrist = ...; Query luceneQuery = mythQB.range() .onField("creationDate") .below(beforeChrist).excludeLimit() .createQuery();
5.1.2.6. Combining Queries
SHOULD
: the query should contain the matching elements of the subquery.MUST
: the query must contain the matching elements of the subquery.MUST NOT
: the query must not contain the matching elements of the subquery.
Example 5.11. Combining Subqueries
//look for popular modern myths that are not urban Date twentiethCentury = ...; Query luceneQuery = mythQB.bool() .must(mythQB.keyword().onField("description").matching("urban").createQuery()) .not() .must(mythQB.range().onField("starred").above(4).createQuery()) .must(mythQB.range() .onField("creationDate") .above(twentiethCentury) .createQuery()) .createQuery(); //look for popular myths that are preferably urban Query luceneQuery = mythQB .bool() .should(mythQB.keyword() .onField("description") .matching("urban") .createQuery()) .must(mythQB.range().onField("starred").above(4).createQuery()) .createQuery(); //look for all myths except religious ones Query luceneQuery = mythQB.all() .except(mythQb.keyword() .onField("description_stem") .matching("religion") .createQuery()) .createQuery();
5.1.2.7. Query Options
boostedTo
(on query type and on field) boosts the query or field to a provided factor.withConstantScore
(on query) returns all results that match the query and have a constant score equal to the boost.filteredBy(Filter)
(on query) filters query results using theFilter
instance.ignoreAnalyzer
(on field) ignores the analyzer when processing this field.ignoreFieldBridge
(on field) ignores the field bridge when processing this field.
Example 5.12. Querying Options
Query luceneQuery = mythQB .bool() .should(mythQB.keyword().onField("description").matching("urban").createQuery()) .should(mythQB .keyword() .onField("name") .boostedTo(3) .ignoreAnalyzer() .matching("urban").createQuery()) .must(mythQB .range() .boostedTo(5) .withConstantScore() .onField("starred") .above(4).createQuery()) .createQuery();
5.1.3. Build a Query with Infinispan Query
5.1.3.1. Generality
Example 5.13. Wrapping a Lucene Query in an Infinispan CacheQuery
CacheQuery cacheQuery = Search.getSearchManager(cache).getQuery(luceneQuery);
Example 5.14. Filtering the Search Result by Entity Type
CacheQuery cacheQuery = Search.getSearchManager(cache).getQuery(luceneQuery, Customer.class); // or CacheQuery cacheQuery = Search.getSearchManager(cache).getQuery(luceneQuery, Item.class, Actor.class);
Customer
instances. The second part of the same example returns matching Actor
and Item
instances. The type restriction is polymorphic. As a result, if the two subclasses Salesman
and Customer
of the base class Person
return, specify Person.class
to filter based on result types.
5.1.3.2. Pagination
Example 5.15. Defining pagination for a search query
CacheQuery cacheQuery = Search.getSearchManager(cache) .getQuery(luceneQuery, Customer.class); cacheQuery.firstResult(15); //start from the 15th element cacheQuery.maxResults(10); //return 10 elements
Note
cacheQuery.getResultSize()
.
5.1.3.3. Sorting
Example 5.16. Specifying a Lucene Sort
org.infinispan.query.CacheQuery cacheQuery = Search.getSearchManager(cache).getQuery(luceneQuery, Book.class); org.apache.lucene.search.Sort sort = new Sort( new SortField("title", SortField.STRING)); cacheQuery.sort(sort); List results = cacheQuery.list();
Note
5.1.3.4. Projection
Example 5.17. Using Projection Instead of Returning the Full Domain Object
SearchManager searchManager = Search.getSearchManager(cache); CacheQuery cacheQuery = searchManager.getQuery(luceneQuery, Book.class); cacheQuery.projection("id", "summary", "body", "mainAuthor.name"); List results = cacheQuery.list(); Object[] firstResult = (Object[]) results.get(0); Integer id = (Integer) firstResult[0]; String summary = (String) firstResult[1]; String body = (String) firstResult[2]; String authorName = (String) firstResult[3];
Object[]
. Projections prevent a time consuming database round-trip. However, they have following constraints:
- The properties projected must be stored in the index (
@Field(store=Store.YES)
), which increases the index size. - The properties projected must use a
FieldBridge
implementingorg.infinispan.query.bridge.TwoWayFieldBridge
ororg.infinispan.query.bridge.TwoWayStringBridge
, the latter being the simpler version.Note
All Lucene-based Query API built-in types are two-way. - Only the simple properties of the indexed entity or its embedded associations can be projected. Therefore a whole embedded entity cannot be projected.
- Projection does not work on collections or maps which are indexed via
@IndexedEmbedded
Example 5.18. Using Projection to Retrieve Metadata
SearchManager searchManager = Search.getSearchManager(cache); CacheQuery cacheQuery = searchManager.getQuery(luceneQuery, Book.class); cacheQuery.projection("mainAuthor.name"); List results = cacheQuery.list(); Object[] firstResult = (Object[]) results.get(0); float score = (Float) firstResult[0]; Book book = (Book) firstResult[1]; String authorName = (String) firstResult[2];
FullTextQuery.THIS
returns the initialized and managed entity as a non-projected query does.FullTextQuery.DOCUMENT
returns the Lucene Document related to the projected object.FullTextQuery.OBJECT_CLASS
returns the indexed entity's class.FullTextQuery.SCORE
returns the document score in the query. Use scores to compare one result against another for a given query. However, scores are not relevant to compare the results of two different queries.FullTextQuery.ID
is the ID property value of the projected object.FullTextQuery.DOCUMENT_ID
is the Lucene document ID. The Lucene document ID changes between two IndexReader openings.FullTextQuery.EXPLANATION
returns the Lucene Explanation object for the matching object/document in the query. This is not suitable for retrieving large amounts of data. RunningFullTextQuery.EXPLANATION
is as expensive as running a Lucene query for each matching element. As a result, projection is recommended.
5.1.3.5. Limiting the Time of a Query
- Raise an exception when arriving at the limit.
- Limit to the number of results retrieved when the time limit is raised.
5.1.3.6. Raise an Exception on Time Limit
Example 5.19. Defining a Timeout in Query Execution
SearchManagerImplementor searchManager = (SearchManagerImplementor) Search.getSearchManager(cache); searchManager.setTimeoutExceptionFactory(new MyTimeoutExceptionFactory()); CacheQuery cacheQuery = searchManager.getQuery(luceneQuery, Book.class); //define the timeout in seconds cacheQuery.timeout(2, TimeUnit.SECONDS) try { query.list(); } catch (MyTimeoutException e) { //do something, too slow } private static class MyTimeoutExceptionFactory implements TimeoutExceptionFactory { @Override public RuntimeException createTimeoutException(String message, Query query) { return new MyTimeoutException(); } } public static class MyTimeoutException extends RuntimeException { }
getResultSize()
, iterate()
and scroll()
honor the timeout until the end of the method call. As a result, Iterable
or the ScrollableResults
ignore the timeout. Additionally, explain()
does not honor this timeout period. This method is used for debugging and to check the reasons for slow performance of a query.
Important