Chapter 109. MongoDB GridFS
Camel MongoDB GridFS component
Available as of Camel 2.17
Maven users will need to add the following dependency to their pom.xml for this component:

<dependency>
  <groupId>org.apache.camel</groupId>
  <artifactId>camel-mongodb-gridfs</artifactId>
  <version>2.17.0.redhat-630xxx</version> <!-- use the same version as your Camel core version -->
</dependency>
URI format
Camel versions 2.19 and higher:
mongodb-gridfs:connectionBean?database=databaseName&bucket=bucketName[&moreOptions...]
Camel versions 2.17 and 2.18:
gridfs:connectionBean?database=databaseName&bucket=bucketName[&moreOptions...]
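For example, an endpoint that works against the default bucket of a database might look like the following (the bean and database names here are illustrative):

mongodb-gridfs:mongoBean?database=tickets&bucket=fs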
Endpoint options
GridFS endpoints support the following options, depending on whether they are acting as a Producer or as a Consumer. (For consumers, options also vary based on consumer type.)
| Name | Default Value | Description | Producer | Consumer |
|---|---|---|---|---|
| database | none | Required. Specifies the name of the database to which to bind this endpoint. All operations are executed against this database. | Y | Y |
| bucket | fs | Specifies the name of the GridFS bucket within the specified database. Defaults to the GridFS.DEFAULT_BUCKET value. | Y | Y |
| operation | create | Specifies the operation this endpoint executes. Valid values are: count, listAll, findOne, create, and remove. | Y | N |
| query | none | Used in conjunction with the queryStrategy option to create the query used to search for new files. | N | Y |
| queryStrategy | TimeStamp | Specifies the strategy used to find new files. Valid values are: TimeStamp, PersistentTimestamp, FileAttribute, TimestampAndFileAttribute, and PersistentTimestampAndFileAttribute. | N | Y |
| persistentTSCollection | camel-timestamps | Used in conjunction with PersistentTimestamp. Specifies the collection in which the timestamp is stored. | N | Y |
| persistentTSObject | camel-timestamp | Used in conjunction with PersistentTimestamp. Specifies the ID of the timestamp object, which allows each consumer to have its own timestamp ID stored in a common collection. | N | Y |
| fileAttributeName | camel-processed | Used in conjunction with FileAttribute. Specifies the name of the attribute to use. When a file is about to be processed, the attribute is set to processing; when processing has finished, it is set to done. | N | Y |
| delay | 500 (ms) | Specifies the interval, in milliseconds, between subsequent polls of GridFS for new files. | N | Y |
| initialDelay | 1000 (ms) | Specifies the delay, in milliseconds, before polling GridFS for new files the first time. | N | Y |
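For example, a producer endpoint that stores incoming message bodies in a non-default bucket might be configured as in the following sketch (the bean, database, and bucket names are illustrative):

from("direct:storeAttachment")
    .to("mongodb-gridfs:mongoBean?database=tickets&bucket=attachments&operation=create");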
Configuration of database in Spring XML
The following Spring XML creates a bean defining the connection to a MongoDB instance.
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">

  <bean id="mongoBean" class="com.mongodb.Mongo">
    <constructor-arg name="host" value="${mongodb.host}" />
    <constructor-arg name="port" value="${mongodb.port}" />
  </bean>
</beans>
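If you are not using Spring, a minimal sketch of an equivalent setup in plain Java might look like this (the bean name, host, and port are placeholders; MongoClient is used here in place of the deprecated Mongo class):

import org.apache.camel.CamelContext;
import org.apache.camel.impl.DefaultCamelContext;
import org.apache.camel.impl.SimpleRegistry;
import com.mongodb.MongoClient;

SimpleRegistry registry = new SimpleRegistry();
// endpoint URIs refer to this bean by name, for example mongodb-gridfs:mongoBean?...
registry.put("mongoBean", new MongoClient("localhost", 27017));
CamelContext context = new DefaultCamelContext(registry);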
Sample route
The following route, defined in Spring XML, executes the findOne operation:
<route>
  <from uri="direct:start" />
  <!-- using bean 'mongoBean' defined above -->
  <to uri="mongodb-gridfs:mongoBean?database=${mongodb.database}&amp;operation=findOne" />
  <to uri="direct:result" />
</route>
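The equivalent route in the Java DSL might look like the following sketch (the database name is hard-coded here for illustration):

from("direct:start")
    // using bean 'mongoBean' registered in the Camel registry
    .to("mongodb-gridfs:mongoBean?database=tickets&operation=findOne")
    .to("direct:result");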
MongoDB operations - producer endpoints
count
Returns the total number of files in the GridFS bucket as an Integer in the OUT message body:

// from("direct:count").to("mongodb-gridfs:mongoBean?database=tickets&operation=count");
Integer result = template.requestBody("direct:count", "irrelevantBody", Integer.class);
assertTrue("Result is not of type Integer", result instanceof Integer);

You can use a file name header to count only the files matching that name:

Map<String, Object> headers = new HashMap<String, Object>();
headers.put(Exchange.FILE_NAME, "filename.txt");
Integer count = template.requestBodyAndHeaders("direct:count", "irrelevantBody", headers, Integer.class);
listAll
Returns a Reader that lists all of the file names and their IDs in a tab-separated stream:

// from("direct:listAll").to("mongodb-gridfs:mongoBean?database=tickets&operation=listAll");
Reader result = template.requestBody("direct:listAll", "irrelevantBody", Reader.class);

Example output:

filename1.txt   1252314321
filename2.txt   2897651254
findOne
Using the Exchange.FILE_NAME value from the incoming headers, finds a matching file in GridFS, sets the body to an InputStream of its content, and provides the file metadata as headers:

// from("direct:findOne").to("mongodb-gridfs:mongoBean?database=tickets&operation=findOne");
Map<String, Object> headers = new HashMap<String, Object>();
headers.put(Exchange.FILE_NAME, "filename.txt");
InputStream result = template.requestBodyAndHeaders("direct:findOne", "irrelevantBody", headers, InputStream.class);
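To consume the returned stream you might, for example, copy it to a local file (the target path below is illustrative):

// write the retrieved GridFS content to disk and release the stream
java.nio.file.Files.copy(result, java.nio.file.Paths.get("/tmp/filename.txt"),
        java.nio.file.StandardCopyOption.REPLACE_EXISTING);
result.close();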
create
Creates a new file in the GridFS database, using the Exchange.FILE_NAME value from the incoming headers as the file name and the body, an InputStream, as the file contents:

// from("direct:create").to("mongodb-gridfs:mongoBean?database=tickets&operation=create");
Map<String, Object> headers = new HashMap<String, Object>();
headers.put(Exchange.FILE_NAME, "filename.txt");
InputStream stream = ...the data for the file...;
template.requestBodyAndHeaders("direct:create", stream, headers);
remove
Removes a file from the GridFS database:

// from("direct:remove").to("mongodb-gridfs:mongoBean?database=tickets&operation=remove");
Map<String, Object> headers = new HashMap<String, Object>();
headers.put(Exchange.FILE_NAME, "filename.txt");
template.requestBodyAndHeaders("direct:remove", "", headers);
GridFS Consumer
The MongoDB GridFS component polls GridFS periodically for new files to process. Two parameters, delay and initialDelay, control this behavior. delay specifies how long, in milliseconds, the background thread sleeps between polling attempts (default is 500 ms). initialDelay specifies how long the consumer waits after startup before polling GridFS for the first time, which is useful when the backend service needs a little extra time to become available.
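For example, a consumer that waits five seconds before its first poll and then polls every two seconds might be configured as follows (the bean, database, and bucket names are illustrative):

from("mongodb-gridfs:mongoBean?database=tickets&bucket=fs&initialDelay=5000&delay=2000")
    .to("log:newGridFsFiles");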
Several strategies are available to the consumer for determining which files within the grid have not yet been processed:
- TimeStamp (default): On start up, the consumer uses the current time as its starting point. Only files added after the consumer started are processed; all files in the grid that pre-date consumer startup are ignored. After each poll, the consumer updates its timestamp to that of the most recently processed file.
- PersistentTimestamp: On start up, the consumer queries the collection specified by persistentTSCollection for the object named by persistentTSObject and uses its value as the starting timestamp. If that object does not exist, the consumer uses the current time and creates the object. Whenever a file has been processed, the timestamp stored in the collection is updated. Usage example:
  from("mongodb-gridfs:mongoBean?database=myData&queryStrategy=PersistentTimestamp&persistentTSCollection=CamelTimestamps&persistentTSObject=myDataTS").process(...);
- FileAttribute: Instead of using timestamps, the consumer queries GridFS for files that lack the attribute specified by fileAttributeName. When the consumer starts to process a file, this attribute is added to the file in GridFS. Usage example:
  from("mongodb-gridfs:mongoBean?database=tickets&queryStrategy=FileAttribute").process(...);
- TimestampAndFileAttribute: Combining the two strategies, the consumer finds files newer than TimeStamp that also lack the attribute specified by fileAttributeName. During file processing the missing attribute is added to the file in GridFS.
- PersistentTimestampAndFileAttribute: Combining the two strategies, the consumer finds files newer than TimeStamp that also lack the attribute specified by fileAttributeName. During file processing the missing attribute is added to the file in GridFS, and the timestamp stored in the collection is updated.
See also
- Unit tests for more examples of usage