Chapter 3. Querying remote caches
You can index and query remote caches on Data Grid Server.
3.1. Querying caches from Hot Rod Java clients
Data Grid lets you programmatically query remote caches from Java clients through the Hot Rod endpoint. This procedure explains how to index and query a remote cache that stores Book instances.
Prerequisites
- Add the ProtoStream processor to your pom.xml.
  Data Grid provides this processor for the @ProtoField and @ProtoDoc annotations so you can generate Protobuf schemas and perform queries.

  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.infinispan</groupId>
        <artifactId>infinispan-bom</artifactId>
        <version>${version.infinispan}</version>
        <type>pom</type>
      </dependency>
    </dependencies>
  </dependencyManagement>

  <dependencies>
    <dependency>
      <groupId>org.infinispan.protostream</groupId>
      <artifactId>protostream-processor</artifactId>
      <scope>provided</scope>
    </dependency>
  </dependencies>
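If your build does not run the processor automatically from the provided-scope dependency, one option is to register it explicitly on the Maven compiler plugin. This is a minimal sketch rather than part of the procedure; the ${version.protostream} property is an assumption and should match the ProtoStream version that your Infinispan BOM pulls in.

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <annotationProcessorPaths>
            <!-- Runs the ProtoStream annotation processor during compilation -->
            <path>
              <groupId>org.infinispan.protostream</groupId>
              <artifactId>protostream-processor</artifactId>
              <version>${version.protostream}</version>
            </path>
          </annotationProcessorPaths>
        </configuration>
      </plugin>
    </plugins>
  </build>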
Procedure
Add indexing annotations to your class, as in the following example:
Book.java
import org.infinispan.protostream.annotations.ProtoDoc;
import org.infinispan.protostream.annotations.ProtoFactory;
import org.infinispan.protostream.annotations.ProtoField;

@ProtoDoc("@Indexed")
public class Book {

   @ProtoDoc("@Field(index=Index.YES, analyze = Analyze.YES, store = Store.NO)")
   @ProtoField(number = 1)
   final String title;

   @ProtoDoc("@Field(index=Index.YES, analyze = Analyze.YES, store = Store.NO)")
   @ProtoField(number = 2)
   final String description;

   @ProtoDoc("@Field(index=Index.YES, analyze = Analyze.YES, store = Store.NO)")
   @ProtoField(number = 3, defaultValue = "0")
   final int publicationYear;

   @ProtoFactory
   Book(String title, String description, int publicationYear) {
      this.title = title;
      this.description = description;
      this.publicationYear = publicationYear;
   }

   // public getter methods omitted for brevity
}
Implement the SerializationContextInitializer interface in a new class and then add the @AutoProtoSchemaBuilder annotation.

- Reference the class that includes the @ProtoField and @ProtoDoc annotations with the includeClasses parameter.
- Define a name for the Protobuf schema that you generate and a filesystem path with the schemaFileName and schemaFilePath parameters. Specify the package name for the Protobuf schema with the schemaPackageName parameter.

RemoteQueryInitializer.java

import org.infinispan.protostream.SerializationContextInitializer;
import org.infinispan.protostream.annotations.AutoProtoSchemaBuilder;

@AutoProtoSchemaBuilder(
      includeClasses = {
            Book.class
      },
      schemaFileName = "book.proto",
      schemaFilePath = "proto/",
      schemaPackageName = "book_sample")
public interface RemoteQueryInitializer extends SerializationContextInitializer {
}
Compile your project.

The code examples in this procedure generate a proto/book.proto schema and a RemoteQueryInitializerImpl.java implementation of the annotated RemoteQueryInitializer interface.
Next steps
Create a remote cache that configures Data Grid to index your entities. For example, the following remote cache indexes the Book entity in the book.proto schema that you generated in the previous step:

<replicated-cache name="books">
  <indexing>
    <indexed-entities>
      <indexed-entity>book_sample.Book</indexed-entity>
    </indexed-entities>
  </indexing>
</replicated-cache>
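You can create this cache through Data Grid Console, the CLI, or the REST API. As a convenience, the following is a hedged sketch of creating it from the Hot Rod Java client with the remote administration API; it assumes you already have a connected RemoteCacheManager like the one built in RemoteQuery.java below, and a client version where XMLStringConfiguration is available.

import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.commons.configuration.XMLStringConfiguration;

public class CreateBooksCache {

   // Creates the indexed "books" cache on the server if it does not already exist
   static void createBooksCache(RemoteCacheManager remoteCacheManager) {
      String booksCacheConfig =
            "<replicated-cache name=\"books\">" +
              "<indexing>" +
                "<indexed-entities>" +
                  "<indexed-entity>book_sample.Book</indexed-entity>" +
                "</indexed-entities>" +
              "</indexing>" +
            "</replicated-cache>";

      remoteCacheManager.administration()
            .getOrCreateCache("books", new XMLStringConfiguration(booksCacheConfig));
   }
}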
The following RemoteQuery class:

- Registers the RemoteQueryInitializerImpl serialization context with a Hot Rod Java client.
- Registers the Protobuf schema, book.proto, with Data Grid Server.
- Adds two Book instances to the remote cache.
- Performs a full-text query that matches books by keywords in the title.
RemoteQuery.java
package org.infinispan;

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.Search;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;
import org.infinispan.query.dsl.Query;
import org.infinispan.query.dsl.QueryFactory;
import org.infinispan.query.remote.client.ProtobufMetadataManagerConstants;

public class RemoteQuery {

   public static void main(String[] args) throws Exception {
      ConfigurationBuilder clientBuilder = new ConfigurationBuilder();
      // RemoteQueryInitializerImpl is generated at compile time
      clientBuilder.addServer().host("127.0.0.1").port(11222)
            .security().authentication().username("user").password("user")
            .addContextInitializers(new RemoteQueryInitializerImpl());

      RemoteCacheManager remoteCacheManager = new RemoteCacheManager(clientBuilder.build());

      // Read the generated Protobuf schema and register it with the server
      Path proto = Paths.get(RemoteQuery.class.getClassLoader()
            .getResource("proto/book.proto").toURI());
      String protoBufCacheName = ProtobufMetadataManagerConstants.PROTOBUF_METADATA_CACHE_NAME;
      remoteCacheManager.getCache(protoBufCacheName).put("book.proto", Files.readString(proto));

      // Obtain the 'books' remote cache
      RemoteCache<Object, Object> remoteCache = remoteCacheManager.getCache("books");

      // Add some Books
      Book book1 = new Book("Infinispan in Action", "Learn Infinispan with using it", 2015);
      Book book2 = new Book("Cloud-Native Applications with Java and Quarkus", "Build robust and reliable cloud applications", 2019);

      remoteCache.put(1, book1);
      remoteCache.put(2, book2);

      // Execute a full-text query
      QueryFactory queryFactory = Search.getQueryFactory(remoteCache);
      Query<Book> query = queryFactory.create("FROM book_sample.Book WHERE title:'java'");

      List<Book> list = query.execute().list(); // Voila! We have our book back from the cache!
   }
}
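Beyond the full-text query above, the same QueryFactory accepts named parameters and paging. The following is a hedged sketch that reuses queryFactory and the book_sample.Book type from RemoteQuery.java; the :year parameter name and the values are illustrative.

// Select books published in or after a given year, newest first, limited to 10 results
Query<Book> recentBooks = queryFactory.create(
      "FROM book_sample.Book b WHERE b.publicationYear >= :year ORDER BY b.publicationYear DESC");
recentBooks.setParameter("year", 2016);
recentBooks.maxResults(10);
List<Book> recent = recentBooks.execute().list();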
Additional resources
- Marshalling and Encoding Data for more information about creating serialization contexts and registering Protobuf schemas.
- ProtoStream annotations for more information about the @ProtoField, @ProtoDoc, and @AutoProtoSchemaBuilder annotations.
3.2. Querying caches from Data Grid Console and CLI
Data Grid Console and the Data Grid Command Line Interface (CLI) let you query indexed and non-indexed remote caches. You can also use any HTTP client to index and query caches via the REST API.
This procedure explains how to index and query a remote cache that stores Person instances.
Prerequisites
- Have at least one running Data Grid Server instance.
- Have Data Grid credentials with create permissions.
Procedure
Add indexing annotations to your Protobuf schema, as in the following example:
package org.infinispan.example;

/* @Indexed */
message Person {

    /* @Field(index=Index.YES, store = Store.NO, analyze = Analyze.NO) */
    optional int32 id = 1;

    /* @Field(index=Index.YES, store = Store.YES, analyze = Analyze.NO) */
    required string name = 2;

    /* @Field(index=Index.YES, store = Store.YES, analyze = Analyze.NO) */
    required string surname = 3;

    /* @Field(index=Index.YES, store = Store.YES, analyze = Analyze.NO) */
    optional int32 age = 6;
}
From the Data Grid CLI, use the schema command with the --upload= argument as follows:

schema --upload=person.proto person.proto
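If you prefer an HTTP client over the CLI, registering the schema through the REST API is equivalent. A hedged curl example, assuming the server listens on 127.0.0.1:11222 and accepts the credentials user/password:

curl -u user:password -X POST --data-binary @person.proto \
  http://127.0.0.1:11222/rest/v2/schemas/person.proto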
Create a cache named people that uses ProtoStream encoding and configures Data Grid to index entities declared in your Protobuf schema.

The following cache indexes the Person entity from the previous step:

<distributed-cache name="people">
  <encoding media-type="application/x-protostream"/>
  <indexing>
    <indexed-entities>
      <indexed-entity>org.infinispan.example.Person</indexed-entity>
    </indexed-entities>
  </indexing>
</distributed-cache>
From the CLI, use the create cache command with the --file= argument as follows:

create cache --file=people.xml people
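The same cache can be created over the REST API by posting the XML configuration. A hedged curl example, under the same server and credential assumptions as above:

curl -u user:password -X POST -H "Content-Type: application/xml" \
  --data-binary @people.xml http://127.0.0.1:11222/rest/v2/caches/people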
Add entries to the cache.
A remote cache must contain some data before you can query it. For this example procedure, create entries that use the following JSON values:
PersonOne
{ "_type":"org.infinispan.example.Person", "id":1, "name":"Person", "surname":"One", "age":44 }
PersonTwo
{ "_type":"org.infinispan.example.Person", "id":2, "name":"Person", "surname":"Two", "age":27 }
PersonThree
{ "_type":"org.infinispan.example.Person", "id":3, "name":"Person", "surname":"Three", "age":35 }
From the CLI, use the put command with the --file= argument to add each entry, as follows:

put --encoding=application/json --file=personone.json personone
Tip: From Data Grid Console, you must select Custom Type for the Value content type field when you add values in JSON format with custom types.
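Entries can also be written through the REST API. A hedged curl example for the first entry, under the same server and credential assumptions; the key personone goes in the URL path:

curl -u user:password -X POST -H "Content-Type: application/json" \
  --data-binary @personone.json http://127.0.0.1:11222/rest/v2/caches/people/personone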
Query your remote cache.
From the CLI, use the query command from the context of the remote cache.

query "from org.infinispan.example.Person p WHERE p.name='Person' ORDER BY p.age ASC"
The query returns all entries with a name that matches Person, sorted by age in ascending order.
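The equivalent query over the REST API returns the matching entries as JSON. A hedged curl example, again assuming 127.0.0.1:11222 and the credentials user/password; the -G flag makes curl URL-encode the query into a GET request:

curl -u user:password -G \
  --data-urlencode "action=search" \
  --data-urlencode "query=FROM org.infinispan.example.Person p WHERE p.name='Person' ORDER BY p.age ASC" \
  http://127.0.0.1:11222/rest/v2/caches/people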
3.3. Using analyzers with remote caches
Analyzers convert input data into terms that you can index and query. You specify analyzer definitions with the @Field annotation in your Java classes or directly in Protobuf schema.
Procedure
- Include the Analyze.YES attribute to indicate that the property is analyzed.
- Specify the analyzer definition with the @Analyzer annotation.
Protobuf schema
/* @Indexed */
message TestEntity {

    /* @Field(store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = "keyword")) */
    optional string id = 1;

    /* @Field(store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = "simple")) */
    optional string name = 2;
}
Java classes
@ProtoDoc("@Field(store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = \"keyword\"))")
@ProtoField(1)
final String id;

@ProtoDoc("@Field(store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = \"simple\"))")
@ProtoField(2)
final String description;
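Analyzed fields are matched with the full-text operator : rather than an exact = comparison. The following is a hedged Ickle sketch against the TestEntity message above, assuming it is registered under a package named example (adjust to your own schema package); with the simple analyzer, the predicate matches any name that contains the token smith:

FROM example.TestEntity t WHERE t.name : 'smith'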
3.3.1. Default analyzer definitions
Data Grid provides a set of default analyzer definitions.
Definition | Description
---|---
standard | Splits text fields into tokens, treating whitespace and punctuation as delimiters.
simple | Tokenizes input streams by delimiting at non-letters and then converting all letters to lowercase characters. Whitespace and non-letters are discarded.
whitespace | Splits text streams on whitespace and returns sequences of non-whitespace characters as tokens.
keyword | Treats entire text fields as single tokens.
stemmer | Stems English words using the Snowball Porter filter.
ngram | Generates n-gram tokens that are 3 grams in size by default.
filename | Splits text fields into larger size tokens than the standard analyzer, treating whitespace as a delimiter and converting all letters to lowercase characters.
These analyzer definitions are based on Apache Lucene and are provided "as-is". For more information about tokenizers, filters, and CharFilters, see the appropriate Lucene documentation.
3.3.2. Creating custom analyzer definitions
Create custom analyzer definitions and add them to your Data Grid Server installations.
Prerequisites
Stop Data Grid Server if it is running.
Data Grid Server loads classes at startup only.
Procedure
- Implement the ProgrammaticSearchMappingProvider API. Package your implementation in a JAR with the fully qualified class name (FQN) in the following file:
  META-INF/services/org.infinispan.query.spi.ProgrammaticSearchMappingProvider
- Copy your JAR file to the server/lib directory of your Data Grid Server installation.
- Start Data Grid Server.
ProgrammaticSearchMappingProvider example

import org.apache.lucene.analysis.core.LowerCaseFilterFactory;
import org.apache.lucene.analysis.core.StopFilterFactory;
import org.apache.lucene.analysis.standard.StandardFilterFactory;
import org.apache.lucene.analysis.standard.StandardTokenizerFactory;
import org.hibernate.search.cfg.SearchMapping;
import org.infinispan.Cache;
import org.infinispan.query.spi.ProgrammaticSearchMappingProvider;

public final class MyAnalyzerProvider implements ProgrammaticSearchMappingProvider {

   @Override
   public void defineMappings(Cache cache, SearchMapping searchMapping) {
      searchMapping
            .analyzerDef("standard-with-stop", StandardTokenizerFactory.class)
               .filter(StandardFilterFactory.class)
               .filter(LowerCaseFilterFactory.class)
               .filter(StopFilterFactory.class);
   }
}
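After Data Grid Server restarts and loads the JAR, the standard-with-stop definition can be referenced by name in the same way as the default definitions. A minimal sketch of a Protobuf field that uses it; the field itself is illustrative and not part of the procedure:

/* @Field(store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = "standard-with-stop")) */
optional string description = 1;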