Chapter 20. Remote Querying
20.1. Remote Querying
Red Hat JBoss Data Grid’s Hot Rod protocol allows remote, language-neutral querying using either the Infinispan Query Domain-specific Language (DSL) or Ickle, a subset of JP-QL. Both methods can be implemented in every language for which a Hot Rod client is currently available.
The Infinispan Query Domain-specific Language
JBoss Data Grid uses its own query language based on an internal DSL. The Infinispan Query DSL provides a simplified way of writing queries, and is agnostic of the underlying query mechanisms. Additional information on the Infinispan Query DSL is available at The Infinispan Query DSL.
Ickle
Ickle is a string-based query language that allows full-text and relational searches. Additional information on Ickle is available at Building a Query using Ickle, the JP-QL API.
Protobuf Encoding
Google’s Protocol Buffers is used as the encoding format for both storing and querying data. The Infinispan Query DSL can be used remotely via a Hot Rod client that is configured to use the Protobuf marshaller. Protocol Buffers provides a common format for storing cache entries and marshalling them. Remote clients that need to index and query their stored entities must use the Protobuf encoding format. If indexing is not required, Protobuf entities can still be stored without indexing enabled for the benefit of platform independence.
20.2. Querying Comparison
In Library mode, Lucene Query-based, DSL, and Ickle querying are available. In Remote Client-Server mode, Lucene Query-based querying is not available; remote queries use the Infinispan Query DSL or Ickle. The following table is a feature comparison between Lucene Query-based querying, the Infinispan Query DSL, and Ickle in both modes.
Feature | Library Mode/Lucene Query | Library Mode/DSL Query | Remote Client-Server Mode/DSL Query | Library Mode/Ickle Query | Remote Client-Server Mode/Ickle Query |
---|---|---|---|---|---|
Indexing | Mandatory | Optional but highly recommended | Optional but highly recommended | Optional but highly recommended | Optional but highly recommended |
Index contents | Selected fields | Selected fields | Selected fields | Selected fields | Selected fields |
Data Storage Format | Java objects | Java objects | Protocol buffers | Java objects | Protocol buffers |
Keyword Queries | Yes | No | No | Yes | Yes |
Range Queries | Yes | Yes | Yes | Yes | Yes |
Fuzzy Queries | Yes | No | No | Yes | Yes |
Wildcard | Yes | Limited to like queries (Matches a wildcard pattern that follows JPA rules). | Limited to like queries (Matches a wildcard pattern that follows JPA rules). | Yes | Yes |
Phrase Queries | Yes | No | No | Yes | Yes |
Combining Queries | AND, OR, NOT, SHOULD | AND, OR, NOT | AND, OR, NOT | AND, OR, NOT | AND, OR, NOT |
Sorting Results | Yes | Yes | Yes | Yes | Yes |
Filtering Results | Yes, both within the query and as appended operator | Within the query | Within the query | Within the query | Within the query |
Pagination of Results | Yes | Yes | Yes | Yes | Yes |
Continuous Queries | No | Yes | Yes | No | No |
Query Aggregation Operations | No | Yes | Yes | Yes | Yes |
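The Wildcard row above refers to like patterns that follow JPA rules, where % matches any sequence of characters and _ matches exactly one character. The following is a minimal client-side sketch of those two rules only. The class and method names are hypothetical, escape characters are not handled, and this is not how the server evaluates queries.

```java
import java.util.regex.Pattern;

public class LikePatternSketch {

    /** Translates a JPA-style LIKE pattern into an equivalent regular expression. */
    static Pattern toRegex(String likePattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");   // '%' matches any sequence, including the empty one
            } else if (c == '_') {
                regex.append('.');    // '_' matches exactly one character
            } else {
                // every other character is matched literally
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** Returns true if the whole value matches the LIKE pattern. */
    static boolean like(String value, String likePattern) {
        return toRegex(likePattern).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(like("John", "J%"));    // true
        System.out.println(like("Alfred", "J%"));  // false
        System.out.println(like("Jack", "J_ck"));  // true
    }
}
```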
20.3. Performing Remote Queries via the Hot Rod Java Client
Remote querying over Hot Rod can be enabled once the RemoteCacheManager has been configured with the Protobuf marshaller.
The following procedure describes how to enable remote querying over its caches.
Prerequisites
RemoteCacheManager must be configured to use the Protobuf Marshaller.
Enabling Remote Querying via Hot Rod
Add the infinispan-remote.jar
The infinispan-remote.jar is an uberjar, and therefore no other dependencies are required for this feature.
Enable indexing on the cache configuration
Indexing is not mandatory for Remote Queries, but it is highly recommended because it makes searches on caches that contain large amounts of data significantly faster. Indexing can be configured at any time. Enabling and configuring indexing is the same as for Library mode.
Add the following configuration within the cache-container element located inside the Infinispan subsystem element.

    <!-- A basic example of an indexed local cache that uses the RAM Lucene directory provider -->
    <local-cache name="an-indexed-cache">
        <!-- Enable indexing using the RAM Lucene directory provider -->
        <indexing index="ALL">
            <property name="default.directory_provider">ram</property>
        </indexing>
    </local-cache>

Register the Protobuf schema definition files
Register the Protobuf schema definition files by adding them to the ___protobuf_metadata system cache. The cache key is a string that denotes the file name, and the value is the contents of the .proto file, as a string. Alternatively, Protobuf schemas can be registered by invoking the registerProtofile methods of the server’s ProtobufMetadataManager MBean. There is one instance of this MBean per cache container, and it is backed by the ___protobuf_metadata cache, so the two approaches are equivalent.
For an example of providing the Protobuf schema via the ___protobuf_metadata system cache, see Registering a Protocol Buffers schema file.
Note
Writing to the ___protobuf_metadata cache requires the ___schema_manager role be added to the user performing the write.
The following example demonstrates how to invoke the registerProtofile methods of the ProtobufMetadataManager MBean.
Registering Protobuf schema definition files via JMX

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    ...

    String serverHost = ...         // The address of your JDG server
    int serverJmxPort = ...         // The JMX port of your server
    String cacheContainerName = ... // The name of your cache container
    String schemaFileName = ...     // The name of the schema file
    String schemaFileContents = ... // The Protobuf schema file contents

    // Open a JMX connection to the server
    JMXConnector jmxConnector = JMXConnectorFactory.connect(new JMXServiceURL(
        "service:jmx:remoting-jmx://" + serverHost + ":" + serverJmxPort));
    MBeanServerConnection jmxConnection = jmxConnector.getMBeanServerConnection();

    // Build the ObjectName of the ProtobufMetadataManager MBean for the cache container
    ObjectName protobufMetadataManagerObjName =
        new ObjectName("jboss.infinispan:type=RemoteQuery,name=" +
        ObjectName.quote(cacheContainerName) +
        ",component=ProtobufMetadataManager");

    // Invoke registerProtofile(String name, String contents)
    jmxConnection.invoke(protobufMetadataManagerObjName,
                         "registerProtofile",
                         new Object[]{schemaFileName, schemaFileContents},
                         new String[]{String.class.getName(), String.class.getName()});
    jmxConnector.close();

Result
All data placed in the cache is immediately searchable, whether or not indexing is in use. Entries do not need to be annotated, unlike embedded queries. The entity classes are only meaningful to the Java client and do not exist on the server.
Once remote querying has been enabled, the QueryFactory can be obtained using the following:
Obtaining the QueryFactory

    import java.util.List;

    import org.infinispan.client.hotrod.Search;
    import org.infinispan.query.dsl.QueryFactory;
    import org.infinispan.query.dsl.Query;
    import org.infinispan.query.dsl.SortOrder;

    ...

    // Each entry needs its own key; reusing a key would overwrite the earlier entry
    remoteCache.put(2, new User("John", 33));
    remoteCache.put(3, new User("Alfred", 40));
    remoteCache.put(4, new User("Jack", 56));
    remoteCache.put(5, new User("Jerry", 20));

    QueryFactory qf = Search.getQueryFactory(remoteCache);

    // Users whose name starts with "J" and whose age is at least 33, youngest first
    Query query = qf.from(User.class)
        .orderBy("age", SortOrder.ASC)
        .having("name").like("J%")
        .and().having("age").gte(33)
        .build();

    List<User> list = query.list();
    assertEquals(2, list.size());
    assertEquals("John", list.get(0).getName());
    assertEquals(33, list.get(0).getAge());
    assertEquals("Jack", list.get(1).getName());
    assertEquals(56, list.get(1).getAge());

Queries can now be run over Hot Rod in the same way as in Library mode.
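As a plain-Java reading of the DSL query in the example, the following sketch applies the same three conditions (name like J%, age of at least 33, ascending order by age) to an in-memory list. It is only meant to illustrate which entries the query selects. The User class here is a hypothetical stand-in, and no Hot Rod or Infinispan API is involved.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class QuerySemanticsSketch {

    /** Hypothetical stand-in for the User entity used in the example. */
    static final class User {
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    /** Equivalent of: name LIKE 'J%' AND age >= 33, ordered by age ascending. */
    static List<User> select(List<User> users) {
        return users.stream()
                .filter(u -> u.name.startsWith("J"))
                .filter(u -> u.age >= 33)
                .sorted(Comparator.comparingInt((User u) -> u.age))
                .collect(Collectors.toList());
    }

    /** The four users from the example, as four distinct entries. */
    static List<User> sampleUsers() {
        List<User> users = new ArrayList<>();
        users.add(new User("John", 33));
        users.add(new User("Alfred", 40));
        users.add(new User("Jack", 56));
        users.add(new User("Jerry", 20));
        return users;
    }

    public static void main(String[] args) {
        for (User u : select(sampleUsers())) {
            System.out.println(u.name + " " + u.age);
        }
    }
}
```

Running the sketch prints John 33 followed by Jack 56, which matches the assertions in the example.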
20.4. Remote Querying in the Hot Rod C++ Client
For instructions on using remote querying in the Hot Rod C++ Client refer to Performing Remote Queries in the Hot Rod C++ Client.
20.5. Remote Querying in the Hot Rod C# Client
For instructions on using remote querying in the Hot Rod C# Client refer to Performing Remote Queries in the Hot Rod C# Client.
20.6. Protobuf Encoding
20.6.1. Protobuf Encoding
The Infinispan Query DSL can be used remotely via the Hot Rod client. In order to do this, protocol buffers are used to adopt a common format for storing cache entries and marshalling them.
For more information, see https://developers.google.com/protocol-buffers/docs/overview
20.6.2. Storing Protobuf Encoded Entities
Protobuf requires data to be structured. This is achieved by declaring Protocol Buffer message types in .proto files.
For example:
library.proto

    package book_sample;

    message Book {
        required string title = 1;
        required string description = 2;
        required int32 publicationYear = 3; // no native Date type available in Protobuf

        repeated Author authors = 4;
    }

    message Author {
        required string name = 1;
        required string surname = 2;
    }

In the provided example:
- An entity named Book is placed in a package named book_sample.

        package book_sample;
        message Book {

- The entity declares several fields of primitive types and a repeatable field named authors.

        required string title = 1;
        required string description = 2;
        required int32 publicationYear = 3; // no native Date type available in Protobuf

        repeated Author authors = 4;
        }

- The Author message instances are embedded in the Book message instance.

        message Author {
            required string name = 1;
            required string surname = 2;
        }

20.6.3. About Protobuf Messages
There are a few important things to note about Protobuf messages:
- Nesting of messages is possible; however, the resulting structure is strictly a tree, and never a graph.
- There is no type inheritance.
- Collections are not supported; however, arrays can be easily emulated using repeated fields.
20.6.4. Using Protobuf with Hot Rod
Protobuf can be used with JBoss Data Grid’s Hot Rod using the following two steps:
- Configure the client to use a dedicated marshaller, in this case the ProtoStreamMarshaller. This marshaller uses the ProtoStream library to assist in encoding objects.
  Important
  If the infinispan-remote jar is not in use, then the infinispan-remote-query-client Maven dependency must be added to use the ProtoStreamMarshaller.
- Instruct the ProtoStream library on how to marshall message types by registering per entity marshallers.
Use the ProtoStreamMarshaller to Encode and Marshall Messages

    import org.infinispan.client.hotrod.RemoteCacheManager;
    import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;
    import org.infinispan.client.hotrod.marshall.ProtoStreamMarshaller;
    import org.infinispan.protostream.FileDescriptorSource;
    import org.infinispan.protostream.SerializationContext;

    ...

    // Configure the client to use the ProtoStreamMarshaller
    ConfigurationBuilder clientBuilder = new ConfigurationBuilder();
    clientBuilder.addServer()
        .host("127.0.0.1").port(11234)
        .marshaller(new ProtoStreamMarshaller());

    RemoteCacheManager remoteCacheManager = new RemoteCacheManager(clientBuilder.build());

    // Obtain the SerializationContext associated with the RemoteCacheManager
    SerializationContext serCtx =
        ProtoStreamMarshaller.getSerializationContext(remoteCacheManager);

    // Register the schema and the per entity marshallers
    serCtx.registerProtoFiles(FileDescriptorSource.fromResources("/library.proto"));
    serCtx.registerMarshaller(new BookMarshaller());
    serCtx.registerMarshaller(new AuthorMarshaller());

    // Book and Author classes omitted for brevity

In the provided example:
- The SerializationContext is provided by the ProtoStream library.
- The SerializationContext.registerProtoFiles method receives a FileDescriptorSource built from the name of a .proto classpath resource file that contains the message type definitions.
- The SerializationContext associated with the RemoteCacheManager is obtained, then ProtoStream is instructed to marshall the Protobuf types.
A RemoteCacheManager has no SerializationContext associated with it unless it was configured to use ProtoStreamMarshaller.
20.6.5. Registering Per Entity Marshallers
When using the ProtoStreamMarshaller for remote querying purposes, the user must register a per entity marshaller for each domain model type, or marshalling will fail. Marshallers must be stateless and thread-safe, because a single instance of each marshaller is shared.
The following example shows how to write a marshaller.
BookMarshaller.java

    import java.io.IOException;
    import java.util.HashSet;
    import java.util.Set;

    import org.infinispan.protostream.MessageMarshaller;

    ...

    public class BookMarshaller implements MessageMarshaller<Book> {

        @Override
        public String getTypeName() {
            // Fully qualified Protobuf type name: package book_sample, message Book
            return "book_sample.Book";
        }

        @Override
        public Class<? extends Book> getJavaClass() {
            return Book.class;
        }

        @Override
        public void writeTo(ProtoStreamWriter writer, Book book) throws IOException {
            writer.writeString("title", book.getTitle());
            writer.writeString("description", book.getDescription());
            // publicationYear is a required field, so it must be written as well
            // (getPublicationYear() is assumed to exist on the omitted Book class)
            writer.writeInt("publicationYear", book.getPublicationYear());
            writer.writeCollection("authors", book.getAuthors(), Author.class);
        }

        @Override
        public Book readFrom(ProtoStreamReader reader) throws IOException {
            String title = reader.readString("title");
            String description = reader.readString("description");
            int publicationYear = reader.readInt("publicationYear");
            Set<Author> authors = reader.readCollection("authors",
                new HashSet<Author>(), Author.class);
            return new Book(title, description, publicationYear, authors);
        }
    }

Once the client has been set up, reading and writing Java objects to the remote cache will use the entity marshallers. The actual data stored in the cache will be Protobuf encoded, provided that marshallers were registered with the remote client for all involved types. In the provided example, these are Book and Author.
Objects stored in Protobuf format can be used by compatible clients written in different languages.
20.6.6. Indexing Protobuf Encoded Entities
Once the client is configured to use Protobuf, indexing can be configured for caches on the server side.
To index the entries, the server must have the knowledge of the message types defined by the Protobuf schema. A Protobuf schema is defined in a file with a .proto extension. The schema is supplied to the server either by placing it in the ___protobuf_metadata cache by a put, putAll, putIfAbsent, or replace operation, or alternatively by invoking the ProtobufMetadataManager MBean via JMX. Both keys and values of the ___protobuf_metadata cache are Strings, the key being the file name, while the value is the schema file contents.
Writing to the ___protobuf_metadata cache requires the ___schema_manager role be added to the user performing the write.
Registering a Protocol Buffers schema file

    import org.infinispan.client.hotrod.RemoteCache;
    import org.infinispan.client.hotrod.RemoteCacheManager;
    import org.infinispan.query.remote.client.ProtobufMetadataManagerConstants;

    RemoteCacheManager remoteCacheManager = ... // obtain a RemoteCacheManager

    // obtain the '___protobuf_metadata' cache
    RemoteCache<String, String> metadataCache =
        remoteCacheManager.getCache(
            ProtobufMetadataManagerConstants.PROTOBUF_METADATA_CACHE_NAME);

    String schemaFileContents = ... // this is the contents of the schema file

    // the key is the schema file name, the value is the schema file contents
    metadataCache.put("my_protobuf_schema.proto", schemaFileContents);

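The registration example leaves open how the schema file contents are obtained. One possible approach, sketched below, reads a .proto file from disk and stores it under its file name. The helper and class names are hypothetical. The helper accepts any Map<String, String>, on the assumption that the RemoteCache obtained for the metadata cache can be passed in as a Map; the sketch itself uses a HashMap as a stand-in so that it runs without a server.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class SchemaRegistrationSketch {

    /** Stores the file contents under the file name, and returns the key that was used. */
    static String registerSchema(Map<String, String> metadataCache, Path protoFile)
            throws IOException {
        String key = protoFile.getFileName().toString();
        String contents = new String(Files.readAllBytes(protoFile), StandardCharsets.UTF_8);
        metadataCache.put(key, contents);
        return key;
    }

    /** Writes a small schema to a temporary file and registers it in a stand-in map. */
    static Map<String, String> demo() {
        try {
            Path protoFile = Files.createTempDirectory("schemas").resolve("library.proto");
            String schema = "package book_sample;\n"
                    + "message Author { required string name = 1; }\n";
            Files.write(protoFile, schema.getBytes(StandardCharsets.UTF_8));

            Map<String, String> standInCache = new HashMap<>(); // not a real RemoteCache
            registerSchema(standInCache, protoFile);
            return standInCache;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo().keySet()); // prints "[library.proto]"
    }
}
```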
The ProtobufMetadataManager is a cluster-wide replicated repository of Protobuf schema definitions or .proto files. For each running cache manager, a separate ProtobufMetadataManager MBean instance exists, and is backed by the ___protobuf_metadata cache. The ProtobufMetadataManager ObjectName uses the following pattern:

    <jmx domain>:type=RemoteQuery,name=<cache manager name>,component=ProtobufMetadataManager

The following signature is used by the method that registers the Protobuf schema file:
void registerProtofile(String name, String contents)
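The ObjectName pattern can be composed with the standard javax.management API, as the JMX registration example earlier in this chapter does. The sketch below isolates that step so the parts of the name are easy to inspect. The class and method names are hypothetical, and "clustered" is only an example cache manager name.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class ObjectNameSketch {

    /** Builds the ProtobufMetadataManager ObjectName for a JMX domain and cache manager. */
    static ObjectName protobufMetadataManagerName(String jmxDomain, String cacheManagerName) {
        try {
            return new ObjectName(jmxDomain + ":type=RemoteQuery,name="
                    + ObjectName.quote(cacheManagerName) // quoting protects special characters
                    + ",component=ProtobufMetadataManager");
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid JMX domain: " + jmxDomain, e);
        }
    }

    public static void main(String[] args) {
        ObjectName name = protobufMetadataManagerName("jboss.infinispan", "clustered");
        System.out.println(name.getDomain());
        System.out.println(name.getKeyProperty("type"));
        // the name property is stored in quoted form, so unquote it for display
        System.out.println(ObjectName.unquote(name.getKeyProperty("name")));
        System.out.println(name.getKeyProperty("component"));
    }
}
```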
If indexing is enabled for a cache, all fields of Protobuf-encoded entries are indexed. All Protobuf-encoded entries are searchable, regardless of whether indexing is enabled.
Indexing is recommended for improved performance but is not mandatory when using remote queries. Using indexing improves the searching speed but can also reduce the insert/update speeds due to the overhead required to maintain the index.
20.6.7. Controlling Field Indexing
After you enable indexing for a cache, all Protobuf type fields are indexed and stored by default. However, this indexing can degrade performance and result in inefficient querying for Protobuf message types that contain many fields or very large fields.
You can control which fields are indexed by using the @Indexed and @IndexedField annotations directly in the Protobuf schema. Place each annotation in a documentation comment, on the last line of the comment that precedes the message or field to annotate.
@Indexed
- Applies to message types only.
- Has a boolean value. The default value is true, so specifying @Indexed has the same result as @Indexed(true).
- Lets you specify the fields of the message type which are indexed. Using @Indexed(false) indicates that no fields are to be indexed. As a result, the @IndexedField annotations are ignored.
@IndexedField
- Applies to fields only.
- Has two attributes, index and store. Both attributes default to true, so specifying @IndexedField is equivalent to @IndexedField(index=true, store=true).
  - index specifies if the field is indexed and used for indexed queries.
  - store specifies if the field is stored in the index, which allows the field to be used for projections.
- Takes effect only if the message type that contains the field is annotated with @Indexed.
20.6.7.1. Example of an Annotated Message Type

    /*
      This type is indexed but not all fields are indexed.
      @Indexed
    */
    message Note {

        /*
          This field is indexed but not stored.
          It can be used for querying but not for projections.
          @IndexedField(index=true, store=false)
        */
        optional string text = 1;

        /*
          This field is indexed and stored.
          @IndexedField
        */
        optional string author = 2;

        /*
          This field is stored but not indexed.
          It can be used for projections but not for querying.
          @IndexedField(index=false, store=true)
        */
        optional bool isRead = 3;

        /*
          This field is not indexed or stored.
          @IndexedField(index=false, store=false)
        */
        optional int32 priority = 4;
    }

@IndexedField is deprecated in Red Hat JBoss Data Grid 7.2. However, this version of JBoss Data Grid incorrectly throws a warning that the annotation is deprecated. You can ignore the warning and use @IndexedField.
Alternatively you can use the @Field annotation that replaces the @IndexedField annotation. However, this version of JBoss Data Grid does not support the analyze attribute for the @Field annotation.
You can replace @IndexedField annotations with @Field annotations as follows:
- @IndexedField is equivalent to @Field(store=Store.YES)
- @IndexedField(store=false) is equivalent to @Field
- @IndexedField(index=false, store=false) is equivalent to @Field(index=Index.NO)
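The three equivalences above can be read as one rule: @Field indexes by default and does not store by default, so only the deviations from those defaults need to be spelled out. The following sketch encodes that reading as a small helper that produces the annotation text. The helper is hypothetical, and the result for index=false with store=true is an extrapolation from the listed cases, not something this guide states.

```java
public class FieldAnnotationSketch {

    /** Returns the @Field annotation text matching @IndexedField(index=..., store=...). */
    static String toFieldAnnotation(boolean index, boolean store) {
        StringBuilder attrs = new StringBuilder();
        if (!index) {
            attrs.append("index=Index.NO");   // indexing is on unless switched off
        }
        if (store) {
            if (attrs.length() > 0) {
                attrs.append(", ");
            }
            attrs.append("store=Store.YES");  // storing is off unless switched on
        }
        return attrs.length() == 0 ? "@Field" : "@Field(" + attrs + ")";
    }

    public static void main(String[] args) {
        System.out.println(toFieldAnnotation(true, true));   // @Field(store=Store.YES)
        System.out.println(toFieldAnnotation(true, false));  // @Field
        System.out.println(toFieldAnnotation(false, false)); // @Field(index=Index.NO)
        // Extrapolated case, not listed in the guide:
        System.out.println(toFieldAnnotation(false, true));  // @Field(index=Index.NO, store=Store.YES)
    }
}
```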
20.6.8. Defining Protocol Buffers Schemas With Java Annotations
You can declare Protobuf metadata using Java annotations. Instead of providing a MessageMarshaller implementation and a .proto schema file, you can add minimal annotations to a Java class and its fields.
The objective of this method is to marshal Java objects to Protobuf using the ProtoStream library. The ProtoStream library generates the marshaller internally, so a manually implemented one is not required. The Java annotations require minimal information, such as the Protobuf tag number. The rest is inferred from common sense defaults (Protobuf type, Java collection type, and collection element type) and can be overridden.
The auto-generated schema is registered with the SerializationContext and is also available to users as a reference for implementing domain model classes and marshallers in other languages.
The following are examples of Java annotations:
User.java

    package sample;

    import org.infinispan.protostream.annotations.ProtoEnum;
    import org.infinispan.protostream.annotations.ProtoEnumValue;
    import org.infinispan.protostream.annotations.ProtoField;
    import org.infinispan.protostream.annotations.ProtoMessage;

    @ProtoMessage(name = "ApplicationUser")
    public class User {

        @ProtoEnum(name = "Gender")
        public enum Gender {
            @ProtoEnumValue(number = 1, name = "M")
            MALE,

            @ProtoEnumValue(number = 2, name = "F")
            FEMALE
        }

        @ProtoField(number = 1, required = true)
        public String name;

        @ProtoField(number = 2)
        public Gender gender;
    }

Note.java

    package sample;

    import org.infinispan.protostream.annotations.ProtoDoc;
    import org.infinispan.protostream.annotations.ProtoField;

    @ProtoDoc("@Indexed")
    public class Note {

        private String text;

        private User author;

        @ProtoDoc("@IndexedField(index = true, store = false)")
        @ProtoField(number = 1)
        public String getText() {
            return text;
        }

        public void setText(String text) {
            this.text = text;
        }

        @ProtoDoc("@IndexedField")
        @ProtoField(number = 2)
        public User getAuthor() {
            return author;
        }

        public void setAuthor(User author) {
            this.author = author;
        }
    }

ProtoSchemaBuilderDemo.java

    import org.infinispan.protostream.SerializationContext;
    import org.infinispan.protostream.annotations.ProtoSchemaBuilder;
    import org.infinispan.client.hotrod.RemoteCacheManager;
    import org.infinispan.client.hotrod.marshall.ProtoStreamMarshaller;

    ...

    RemoteCacheManager remoteCacheManager = ... // we have a RemoteCacheManager
    SerializationContext serCtx =
        ProtoStreamMarshaller.getSerializationContext(remoteCacheManager);

    // generate and register a Protobuf schema and marshallers based
    // on Note class and the referenced classes (User class)
    ProtoSchemaBuilder protoSchemaBuilder = new ProtoSchemaBuilder();
    String generatedSchema = protoSchemaBuilder
        .fileName("sample_schema.proto")
        .packageName("sample_package")
        .addClass(Note.class)
        .build(serCtx);

    // the types can be marshalled now
    assertTrue(serCtx.canMarshall(User.class));
    assertTrue(serCtx.canMarshall(Note.class));
    assertTrue(serCtx.canMarshall(User.Gender.class));

    // display the schema file
    System.out.println(generatedSchema);

The following is the .proto file that is generated by the ProtoSchemaBuilderDemo.java example.
sample_schema.proto

    package sample_package;

    /* @Indexed */
    message Note {

        /* @IndexedField(index = true, store = false) */
        optional string text = 1;

        /* @IndexedField */
        optional ApplicationUser author = 2;
    }

    message ApplicationUser {

        enum Gender {
            M = 1;
            F = 2;
        }

        required string name = 1;
        optional Gender gender = 2;
    }

The following table lists the supported Java annotations, with what each one applies to and its parameters.
Annotation | Applies To | Purpose | Requirement | Parameters |
---|---|---|---|---|
@ProtoDoc | Class/Field/Enum/Enum member | Specifies the documentation comment that will be attached to the generated Protobuf schema element (message type, field definition, enum type, enum value definition) | Optional | A single String parameter, the documentation text |
@ProtoMessage | Class | Specifies the name of the generated message type. If missing, the class name is used instead | Optional | name (String), the name of the generated message type; if missing, the Java class name is used by default |
@ProtoField | Field, Getter or Setter | Specifies the Protobuf field number and its Protobuf type. Also indicates if the field is repeated, optional or required and its (optional) default value. If the Java field type is an interface or an abstract class, its actual type must be indicated. If the field is repeatable and the declared collection type is abstract, then the actual collection implementation type must be specified. If this annotation is missing, the field is ignored for marshalling (it is transient). A class must have at least one @ProtoField annotated field | Required | number (int, mandatory), the Protobuf number; type (org.infinispan.protostream.descriptors.Type, optional), the Protobuf type, which can usually be inferred; required (boolean, optional); name (String, optional), the Protobuf name; javaType (Class, optional), the actual type, only needed if the declared type is abstract; collectionImplementation (Class, optional), the actual collection type, only needed if the declared type is abstract; defaultValue (String, optional), the string must have the proper format according to the Java field type |
@ProtoEnum | Enum | Specifies the name of the generated enum type. If missing, the Java enum name is used instead | Optional | name (String), the name of the generated enum type; if missing, the Java enum name is used by default |
@ProtoEnumValue | Enum member | Specifies the numeric value of the corresponding Protobuf enum value | Required | number (int, mandatory), the Protobuf number; name (String), the Protobuf name; if missing, the name of the Java member is used |
The @ProtoDoc annotation can be used to provide documentation comments in the generated schema. It can also be used to inject the @Indexed and @IndexedField annotations where needed (see Custom Fields Indexing with Protobuf).