Chapter 43. File2

File Component - Apache Camel 2.0 onwards

The File component provides access to file systems, allowing files to be processed by any other Apache Camel Components or messages from other components to be saved to disk.

URI format

file:directoryName[?options]

or

file://directoryName[?options]

Where directoryName represents the underlying file directory.

You can append query options to the URI in the following format, ?option=value&option=value&...

Note

Apache Camel only supports endpoints configured with a starting directory. So the directoryName must be a directory. If you want to consume a single file only, you can use the fileName option, e.g. by setting fileName=thefilename. Also, the starting directory must not contain dynamic expressions with ${ } placeholders. Again use the fileName option to specify the dynamic part of the filename.

Avoid reading files currently being written by another application

Beware that the JDK File IO API is limited in detecting whether another application is currently writing or copying a file, and the implementation can differ depending on the OS platform. As a result, Apache Camel may conclude that a file is not locked by another process and start consuming it. Therefore you have to do your own investigation as to what suits your environment. To help with this, Apache Camel provides different readLock options and the doneFileName option that you can use. See also the section called “Consuming files from folders where others drop files directly”.

URI Options

Name	Default Value	Description
`autoCreate`	`true`	Automatically create missing directories in the file's pathname. For the file consumer, that means creating the starting directory. For the file producer, it means the directory where the files should be written.
`bufferSize`	128kb	Write buffer sized in bytes.
`fileName`	`null`	Use Expression such as File Language to dynamically set the filename. For consumers, it's used as a filename filter. For producers, it's used to evaluate the filename to write. If an expression is set, it takes precedence over the `CamelFileName` header. (Note: The header itself can also be an Expression). The expression options support both `String` and `Expression` types. If the expression is a `String` type, it is always evaluated using the File Language. If the expression is an `Expression` type, the specified `Expression` type is used - this allows you, for instance, to use OGNL expressions. For the consumer, you can use it to filter filenames, so you can for instance consume today's file using the File Language syntax: `mydata-${date:now:yyyyMMdd}.txt`. From Camel 2.11 onwards the producers support the `CamelOverruleFileName` header, which takes precedence over any existing `CamelFileName` header; the `CamelOverruleFileName` header is used only once, which avoids having to temporarily store `CamelFileName` and restore it afterwards.
`flatten`	`false`	Flatten is used to flatten the file name path to strip any leading paths, so it's just the file name. This allows you to consume recursively into sub-directories, but when you, for example, write the files to another directory they will be written in a single directory. Setting this to `true` on the producer enforces that any file name received in the `CamelFileName` header will be stripped of any leading paths.
`charset`	`null`	Camel 2.5: this option is used to specify the encoding of the file. Camel will set the Exchange property Exchange.CHARSET_NAME to the value of this option.
`copyAndDeleteOnRenameFail`	`true`	Camel 2.9: whether to fall back and do a copy and delete of the file, in case the file could not be renamed directly. This option is not available for the FTP component.
`renameUsingCopy`	`false`	Camel 2.13.1: Perform rename operations using a copy and delete strategy. This is primarily used in environments where the regular rename operation is unreliable (e.g. across different file systems or networks). This option takes precedence over the `copyAndDeleteOnRenameFail` parameter that will automatically fall back to the copy and delete strategy, but only after additional delays.

Consumer only

Name	Default Value	Description
`initialDelay`	`1000`	Milliseconds before polling the file/directory starts.
`delay`	`500`	Milliseconds before the next poll of the file/directory.
`useFixedDelay`	`true`	Set to `true` to use fixed delay between polls, otherwise fixed rate is used. See ScheduledExecutorService in JDK for details.
`runLoggingLevel`	`TRACE`	Camel 2.8: The consumer logs a start/complete log line when it polls. This option allows you to configure the logging level for that.
`recursive`	`false`	If a directory, will look for files in all the sub-directories as well.
`delete`	`false`	If `true`, the file will be deleted after it is processed successfully.
`noop`	`false`	If `true`, the file is not moved or deleted in any way. This option is good for readonly data, or for ETL type requirements. If `noop=true`, Apache Camel will set `idempotent=true` as well, to avoid consuming the same files over and over again.
`preMove`	`null`	Use Expression such as File Language to dynamically set the filename when moving it before processing. For example to move in-progress files into the `order` directory set this value to `order`.
`move`	`.camel`	Use Expression such as File Language to dynamically set the filename when moving it after processing. To move files into a `.done` subdirectory just enter `.done`.
`moveFailed`	`null`	Use Expression such as File Language to dynamically set the filename when moving failed files after processing. To move files into an `error` subdirectory just enter `error`. Note: When a failed file is moved to another location, Apache Camel handles the error and cannot pick up the file again.
`include`	`null`	Is used to include files, if filename matches the regex pattern.
`exclude`	`null`	Is used to exclude files, if filename matches the regex pattern.
`antInclude`	`null`	Camel 2.10: Ant style filter inclusion, for example `antInclude=**/*.txt`. Multiple inclusions may be specified in comma-delimited format. See below for more details about ant path filters.
`antExclude`	`null`	Camel 2.10: Ant style filter exclusion. If both `antInclude` and `antExclude` are used, `antExclude` takes precedence over `antInclude`. Multiple exclusions may be specified in comma-delimited format. See below for more details about ant path filters.
`antFilterCaseSensitive`	`true`	Camel 2.11: Ant style filter which is case sensitive or not.
`idempotent`	`false`	Option to use the Idempotent Consumer EIP pattern to let Apache Camel skip already processed files. Will by default use a memory based LRUCache that holds 1000 entries. If `noop=true` then idempotent will be enabled as well to avoid consuming the same files over and over again.
`idempotentKey`	`Expression`	Camel 2.11: To use a custom idempotent key. By default the absolute path of the file is used. You can use the File Language, for example to use the file name and file size, you can do: `idempotentKey=${file:name}-${file:size}`.
`idempotentRepository`	`null`	Pluggable repository as a org.apache.camel.processor.idempotent.MessageIdRepository class. Will by default use `MemoryMessageIdRepository` if none is specified and `idempotent` is `true`.
`inProgressRepository`	`memory`	Pluggable in-progress repository as a org.apache.camel.processor.idempotent.MessageIdRepository class. The in-progress repository is used to keep track of the files currently being consumed. By default a memory based repository is used.
`filter`	`null`	Pluggable filter as a `org.apache.camel.component.file.GenericFileFilter` class. Will skip files if filter returns `false` in its `accept()` method. Apache Camel also ships with an ANT path matcher filter in the `camel-spring` component. More details in section below.
`sorter`	`null`	Pluggable sorter as a java.util.Comparator<org.apache.camel.component.file.GenericFile> class.
`sortBy`	`null`	Built-in sort using the File Language. Supports nested sorts, so you can have a sort by file name and as a 2nd group sort by modified date. See sorting section below for details.
`readLock`	`markerFile`	Used by consumer, to only poll the files if it has exclusive read-lock on the file (i.e. the file is not in-progress or being written). Apache Camel will wait until the file lock is granted. The `readLock` option supports the following built-in strategies: `changed` uses a length/modification timestamp to detect whether the file is currently being copied or not. Will wait at least 1 second to determine this, so this option cannot consume files as fast as the others, but can be more reliable as the JDK IO API cannot always determine whether a file is currently being used by another process. `fileLock` is for using `java.nio.channels.FileLock`. This approach should be avoided when accessing a remote file system via a mount/share unless that file system supports distributed file locks. `rename` attempts to rename the file, in order to test whether we can get an exclusive read-lock. `none` is for no read locks at all.
`readLockTimeout`	`0` (for FTP, `2000`)	Optional timeout in milliseconds for the read-lock, if supported by the read-lock. If the read-lock could not be granted and the timeout triggered, then Apache Camel will skip the file. At the next poll, Apache Camel will try the file again, and this time the read-lock may be granted. Currently `fileLock`, `changed` and `rename` support the timeout.
`readLockCheckInterval`	`1000` (for FTP, `5000`)	Camel 2.6: Interval in millis for the read-lock, if supported by the read lock. This interval is used for sleeping between attempts to acquire the read lock. For example when using the `changed` read lock, you can set a higher interval period to cater for slow writes. The default of 1 sec. may be too fast if the producer is very slow writing the file.
`readLockMinLength`	`1`	Camel 2.10.1: This option applies only for `readLock=changed`. This option allows you to configure a minimum file length. By default Camel expects the file to contain data, and thus the default value is 1. You can set this option to zero, to allow consuming zero-length files.
`readLockLoggingLevel`	`WARN`	Camel 2.12: Logging level used when a read lock could not be acquired. By default a WARN is logged. You can change this level, for example to OFF to not have any logging. This option is only applicable for `readLock` of types: `changed`, `fileLock`, `rename`.
`readLockMarkerFile`	`true`	Camel 2.14: Whether to use marker file with the changed, rename, or exclusive read lock types. By default a marker file is used as well to guard against other processes picking up the same files. This behavior can be turned off by setting this option to false. For example if you do not want to write marker files to the file systems by the Camel application.
`directoryMustExist`		Camel 2.5: Similar to `startingDirectoryMustExist` but this applies during polling recursive sub directories.
`doneFileName`	`null`	Camel 2.6: If provided, Camel will only consume files if a done file exists. This option configures what file name to use. Either you can specify a fixed name. Or you can use dynamic placeholders. The done file is always expected in the same folder as the original file. See using done file and writing done file sections for examples.
`exclusiveReadLockStrategy`	`null`	Pluggable read-lock as a `org.apache.camel.component.file.GenericFileExclusiveReadLockStrategy` implementation.
`maxMessagesPerPoll`	`0`	An integer that defines the maximum number of messages to gather per poll. By default, no maximum is set. Can be used to set a limit of e.g. 1000 to avoid having the server read thousands of files as it starts up. Set a value of 0 or negative to disable it.
`eagerMaxMessagesPerPoll`	`true`	Camel 2.9.3: Allows for controlling whether the limit from `maxMessagesPerPoll` is eager or not. If eager, the limit is applied during the scanning of files, whereas false would scan all files and then perform sorting. Setting this option to false allows for sorting all files first and then limiting the poll. Mind that this requires higher memory usage as all file details are held in memory to perform the sorting.
`minDepth`	0	Camel 2.8: The minimum depth to start processing when recursively processing a directory. Using `minDepth=1` means the base directory. Using `minDepth=2` means the first sub directory. This option is not supported by FTP consumer.
`maxDepth`	`Integer.MAX_VALUE`	Camel 2.8: The maximum depth to traverse when recursively processing a directory. This option is not supported by FTP consumer.
`processStrategy`	`null`	A pluggable `org.apache.camel.component.file.GenericFileProcessStrategy` allowing you to implement your own `readLock` option or similar. Can also be used when special conditions must be met before a file can be consumed, such as a special ready file exists. If this option is set then the `readLock` option does not apply.
`startingDirectoryMustExist`	`false`	Whether the starting directory must exist. Mind that the `autoCreate` option is default enabled, which means the starting directory is normally auto-created if it doesn't exist. You can disable `autoCreate` and enable this to ensure the starting directory must exist. Will throw an exception, if the directory doesn't exist.
`pollStrategy`	`null`	A pluggable `org.apache.camel.spi.PollingConsumerPollStrategy` allowing you to provide your custom implementation to control the handling of errors that usually occur during the poll operation, before an `Exchange` has been created and routed in Camel. In other words, the error occurred while the polling was gathering information; for instance, access to a file network failed so Camel cannot access it to scan for files. The default implementation will log the caused exception at `WARN` level and ignore it.
`sendEmptyMessageWhenIdle`	`false`	Camel 2.9: If the polling consumer did not poll any files, you can enable this option to send an empty message (no body) instead.
`consumer.bridgeErrorHandler`	`false`	Camel 2.10: Allows for bridging the consumer to the Camel routing Error Handler, which means any exceptions that occur while trying to pick up files, or the like, will now be processed as a message and handled by the routing Error Handler. By default the consumer will use the `org.apache.camel.spi.ExceptionHandler` to deal with exceptions, which by default will be logged at `WARN`/`ERROR` level and ignored. See further below on this page for more details, at section How to use the Camel error handler to deal with exceptions triggered outside the routing engine.
`scheduledExecutorService`	`null`	Camel 2.10: Allows for configuring a custom/shared thread pool to use for the consumer. By default each consumer has its own single threaded thread pool. This option allows you to share a thread pool among multiple file consumers.
`scheduler`	`null`	Camel 2.12: To use a custom scheduler to trigger the consumer to run. See more details at Polling Consumer, for example there is a Quartz2, and Spring based scheduler that supports CRON expressions.
`backoffMultiplier`	`0`	Camel 2.12: To let the scheduled polling consumer back off if there has been a number of subsequent idles/errors in a row. The multiplier is then the number of polls that will be skipped before the next actual attempt happens again. When this option is in use then `backoffIdleThreshold` and/or `backoffErrorThreshold` must also be configured. See more details at Polling Consumer.
`backoffIdleThreshold`	`0`	Camel 2.12: The number of subsequent idle polls that should happen before the `backoffMultiplier` should kick in.
`backoffErrorThreshold`	`0`	Camel 2.12: The number of subsequent error polls (failed due to some error) that should happen before the `backoffMultiplier` should kick in.
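
The difference between the eager and non-eager modes of `maxMessagesPerPoll` described in the table can be illustrated with plain Java. The following is a minimal sketch under stated assumptions, not Camel's implementation: the class `MaxMessagesSketch` and its `poll` method are hypothetical names, and sorting by file name stands in for whatever `sorter` or `sortBy` is configured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; these names are not part of Camel.
class MaxMessagesSketch {

    /**
     * scanOrder is the order in which the directory scan happens to return files.
     * max <= 0 means "no limit", mirroring the table above.
     */
    static List<String> poll(List<String> scanOrder, int max, boolean eager) {
        List<String> result = new ArrayList<>();
        for (String name : scanOrder) {
            // Eager: stop scanning as soon as the limit is reached.
            if (eager && max > 0 && result.size() >= max) {
                break;
            }
            result.add(name);
        }
        // Sorting by name is an assumption standing in for the configured sorter.
        Collections.sort(result);
        // Non-eager: every file was scanned and sorted, the limit is applied last.
        if (!eager && max > 0 && result.size() > max) {
            result = new ArrayList<>(result.subList(0, max));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> scanned = Arrays.asList("c.txt", "a.txt", "b.txt");
        System.out.println("eager:     " + poll(scanned, 2, true));
        System.out.println("non-eager: " + poll(scanned, 2, false));
    }
}
```

In the eager case the result depends on scan order, because only the first two scanned files are ever sorted. In the non-eager case the two files that sort first are returned, at the cost of holding every scanned entry in memory.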

Default behavior for file consumer

By default the file is locked for the duration of the processing.
After the route has completed, files are moved into the .camel subdirectory, so that they appear to be deleted.
The File Consumer will always skip any file whose name starts with a dot, such as ., .camel, .m2 or .groovy.
Only files (not directories) are checked for a valid filename when options such as includeNamePrefix, includeNamePostfix, excludeNamePrefix, excludeNamePostfix, or regexPattern are used.
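
As a rough illustration of name-based filtering, the sketch below combines the dot-file rule above with the `include` and `exclude` regex options from the consumer table. It is a simplified model, not Camel's code: `NameFilterSketch` is a hypothetical class, and it assumes that a regex must match the whole file name, which is an assumption of this sketch.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Camel.
class NameFilterSketch {

    /** include and exclude are regex patterns; null means "not configured". */
    static boolean accept(String fileName, String include, String exclude) {
        // Files whose name starts with a dot are always skipped.
        if (fileName.startsWith(".")) {
            return false;
        }
        // Assumption: the pattern must match the entire name.
        if (exclude != null && Pattern.matches(exclude, fileName)) {
            return false;
        }
        if (include != null && !Pattern.matches(include, fileName)) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(accept("report.txt", ".*\\.txt", null));
        System.out.println(accept(".camel", null, null));
    }
}
```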

Producer only

Name	Default Value	Description
`fileExist`	`Override`	What to do if a file already exists with the same name. The following values can be specified: Override, Append, Fail, Ignore, Move, and TryRename (Camel 2.11.1). `Override`, which is the default, replaces the existing file. `Append` adds content to the existing file. `Fail` throws a `GenericFileOperationException`, indicating that there is already an existing file. `Ignore` silently ignores the problem and does not override the existing file, but assumes everything is okay. The `Move` option requires Camel 2.10.1 onwards, and the corresponding `moveExisting` option to be configured as well. The `Move` option will move any existing files before writing the target file. The option `eagerDeleteTargetFile` can be used to control what to do if, when moving the existing file, a file already exists at the move destination, which would otherwise cause the move operation to fail. `TryRename` (Camel 2.11.1) is only applicable if the `tempFileName` option is in use. This allows Camel to try renaming the file from the temporary name to the actual name without doing any exists check. Skipping this check may be faster on some file systems and especially FTP servers.
`tempPrefix`	`null`	This option is used to write the file using a temporary name and then, after the write is complete, rename it to the real name. Can be used to identify files being written and also avoid consumers (not using exclusive read locks) reading in progress files. Is often used by FTP when uploading big files.
`tempFileName`	`null`	Camel 2.1: The same as `tempPrefix` option but offering a more fine grained control on the naming of the temporary filename as it uses the File Language.
`keepLastModified`	`false`	Camel 2.2: Will keep the last modified timestamp from the source file (if any). Will use the `Exchange.FILE_LAST_MODIFIED` header to locate the timestamp. This header can contain either a `java.util.Date` or `long` with the timestamp. If the timestamp exists and the option is enabled it will set this timestamp on the written file. Note: This option only applies to the file producer. You cannot use this option with any of the ftp producers.
`eagerDeleteTargetFile`	`true`	Camel 2.3: Whether or not to eagerly delete any existing target file. This option only applies when you use `fileExist=Override` and the `tempFileName` option as well. You can use this to disable (set it to false) deleting the target file before the temp file is written. For example, you may write big files and want the target file to exist while the temp file is being written. This ensures the target file is only deleted at the very last moment, just before the temp file is renamed to the target filename. From Camel 2.10.1 on this option is also used to control whether to delete any existing files when `fileExist=Move` is enabled, and an existing file exists. If the option `copyAndDeleteOnRenameFail` is `false`, an exception will be thrown if an existing file existed; if it's `true`, the existing file is deleted before the move operation.
`doneFileName`	`null`	Camel 2.6: If provided, then Camel will write a 2nd done file when the original file has been written. The done file will be empty. This option configures what file name to use. Either you can specify a fixed name. Or you can use dynamic placeholders. The done file will always be written in the same folder as the original file. See writing done file section for examples.
`allowNullBody`	`false`	Camel 2.10.1: Used to specify if a null body is allowed during file writing. If set to true, an empty file will be created. If set to false, attempting to send a null body to the file component throws a GenericFileWriteException of 'Cannot write null body to file.'. If the `fileExist` option is set to `Override`, then the file will be truncated, and if set to `Append` the file will remain unchanged.
`forceWrites`	`true`	Camel 2.10.5/2.11: Whether to force syncing writes to the file system. You can turn this off if you do not want this level of guarantee, for example if writing to logs / audit logs etc; this would yield better performance.
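
The write-then-rename idea behind `tempPrefix` can be sketched with the JDK alone. This is a minimal sketch of the general technique, not the component's implementation: `TempPrefixSketch` and `writeViaTemp` are hypothetical names, and replacing an existing target on rename is an assumption chosen to mirror the default `Override` behavior.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; not part of Camel.
class TempPrefixSketch {

    /** Writes to "<tempPrefix><name>" next to the target, then renames to the real name. */
    static Path writeViaTemp(Path target, String tempPrefix, String content) {
        Path temp = target.resolveSibling(tempPrefix + target.getFileName());
        try {
            // Readers that ignore the temporary name never see a half-written target.
            Files.write(temp, content.getBytes(StandardCharsets.UTF_8));
            // Assumption: an existing target is replaced, as with fileExist=Override.
            return Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path target = Paths.get(System.getProperty("java.io.tmpdir"),
                "report-" + System.nanoTime() + ".txt");
        writeViaTemp(target, "inprogress-", "hello");
        System.out.println("target exists: " + Files.exists(target));
        target.toFile().delete(); // clean up the demo file
    }
}
```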

Default behavior for file producer

By default it will override any existing file if one exists with the same name. In Apache Camel 1.x, Append was the default for the file producer. We changed this to Override in Apache Camel 2.0, as this is also the default file operation using java.io.File and the default for the FTP library we use in the camel-ftp component.
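
To make the difference between the simpler `fileExist` values concrete, here is a minimal JDK-only sketch of Override, Append, Fail and Ignore. It is an illustration under stated assumptions rather than the producer's code: `FileExistSketch` is a hypothetical class, `IllegalStateException` stands in for `GenericFileOperationException`, and the Move and TryRename values are left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not part of Camel.
class FileExistSketch {

    enum FileExist { OVERRIDE, APPEND, FAIL, IGNORE }

    /** Writes content to target, resolving a name clash according to the strategy. */
    static void write(Path target, String content, FileExist strategy) {
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        try {
            if (Files.exists(target)) {
                switch (strategy) {
                    case FAIL:
                        // Stand-in for GenericFileOperationException.
                        throw new IllegalStateException("File already exists: " + target);
                    case IGNORE:
                        return; // keep the existing file untouched
                    case APPEND:
                        Files.write(target, bytes, StandardOpenOption.APPEND);
                        return;
                    default:
                        break; // OVERRIDE continues to the plain write below
                }
            }
            Files.write(target, bytes); // creates the file, or truncates and replaces it
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Small helper to read the file back as text. */
    static String read(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path p = Paths.get(System.getProperty("java.io.tmpdir"), "demo-" + System.nanoTime() + ".txt");
        write(p, "first", FileExist.OVERRIDE);
        write(p, "-second", FileExist.APPEND);
        System.out.println(read(p));
        p.toFile().delete(); // clean up the demo file
    }
}
```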

Move and Delete operations

Any move or delete operation is executed after the routing has completed (as a post command), so during processing of the Exchange the file is still located in the inbox folder.

Let's illustrate this with an example:

    from("file://inbox?move=.done").to("bean:handleOrder");

When a file is dropped in the inbox folder, the file consumer notices this and creates a new FileExchange that is routed to the handleOrder bean. The bean then processes the File object. At this point in time the file is still located in the inbox folder. After the bean completes, and thus the route is completed, the file consumer will perform the move operation and move the file to the .done sub-folder.

The move and preMove options are treated as a directory name, which can be either relative or absolute. If relative, the directory is created as a sub-folder within the folder where the file was consumed. However, if you use an expression such as File Language or Simple, the result of evaluating the expression is the file name to be used. For example, if you set

move=../backup/copy-of-${file:name}

then the File Language expression returns the file name to be used.

By default, Apache Camel will move consumed files to the .camel sub-folder relative to the directory where the file was consumed.

If you want to delete the file after processing, the route should be:

    from("file://inbox?delete=true").to("bean:handleOrder");

We have introduced a pre move operation to move files before they are processed. This allows you to mark which files have been scanned as they are moved to this sub folder before being processed.

    from("file://inbox?preMove=inprogress").to("bean:handleOrder");

You can combine the pre move and the regular move:

    from("file://inbox?preMove=inprogress&move=.done").to("bean:handleOrder");

So in this situation, the file is in the inprogress folder when being processed and after it's processed, it's moved to the .done folder.

Fine grained control over Move and PreMove option

The move and preMove options are Expression-based, so we have the full power of the File Language to do advanced configuration of the directory and name pattern. Apache Camel will, in fact, internally convert the directory name you enter into a File Language expression. So when we enter move=.done Apache Camel will convert this into: ${file:parent}/.done/${file:onlyname}. This is only done if Apache Camel detects that you have not provided a ${ } in the option value yourself. So when you enter an expression containing ${ }, the expression is interpreted as a File Language expression.

So if we want to move the file into a backup folder with today's date as the pattern, we can do:

move=backup/${date:now:yyyyMMdd}/${file:name}
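
The conversion rule described in this section can be written down in a few lines. The following is a minimal sketch of that rule only, assuming that the presence of `${` is the sole trigger; `MoveOptionSketch` and `toFileLanguage` are hypothetical names and not Camel API.

```java
// Hypothetical illustration only; not part of Camel.
class MoveOptionSketch {

    /** Plain directory names are wrapped; values that already contain ${ } are kept as-is. */
    static String toFileLanguage(String optionValue) {
        if (optionValue.contains("${")) {
            return optionValue; // already a File Language expression
        }
        return "${file:parent}/" + optionValue + "/${file:onlyname}";
    }

    public static void main(String[] args) {
        System.out.println(toFileLanguage(".done"));
        System.out.println(toFileLanguage("backup/${date:now:yyyyMMdd}/${file:name}"));
    }
}
```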

About moveFailed

The moveFailed option allows you to move files that could not be processed successfully to another location, such as an error folder of your choice. For example, to move the files to an error folder with a timestamp you can use moveFailed=/error/${file:name.noext}-${date:now:yyyyMMddHHmmssSSS}.${file:name.ext}.

See more examples at File Language.

Message Headers

The following headers are supported by this component:

File producer only

Header	Description
`CamelFileName`	Specifies the name of the file to write (relative to the endpoint directory). The name can be a `String`; a `String` with a File Language or Simple expression; or an Expression object. If it's `null` then Apache Camel will auto-generate a filename based on the message unique ID.
`CamelFileNameProduced`	The actual absolute filepath (path + name) for the output file that was written. This header is set by Camel and its purpose is providing end-users with the name of the file that was written.
`CamelOverruleFileName`	Camel 2.11: Is used for overruling the `CamelFileName` header and using this value instead (but only once, as the producer will remove this header after writing the file). The value can only be a String. Notice that if the option `fileName` has been configured, then this is still being evaluated.

File consumer only

Header	Description
`CamelFileName`	Name of the consumed file as a relative file path with offset from the starting directory configured on the endpoint.
`CamelFileNameOnly`	Only the file name (the name with no leading paths).
`CamelFileAbsolute`	A `boolean` option specifying whether the consumed file denotes an absolute path or not. Should normally be `false` for relative paths. Absolute paths should normally not be used, but support for them was added to the move option to allow moving files to absolute paths. They can be used elsewhere as well.
`CamelFileAbsolutePath`	The absolute path to the file. For relative files this path holds the relative path instead.
`CamelFilePath`	The file path. For relative files this is the starting directory + the relative filename. For absolute files this is the absolute path.
`CamelFileRelativePath`	The relative path.
`CamelFileParent`	The parent path.
`CamelFileLength`	A `long` value containing the file size.
`CamelFileLastModified`	A `long` value containing the last modified timestamp of the file.

Batch Consumer

This component implements the Batch Consumer.

Exchange Properties, file consumer only

As the file consumer implements BatchConsumer, it supports batching the files it polls. Batching means that Apache Camel adds some properties to the Exchange, so you know the number of files polled and the current index in that order.

Property	Description
`CamelBatchSize`	The total number of files that were polled in this batch.
`CamelBatchIndex`	The current index of the batch. Starts from 0.
`CamelBatchComplete`	A `boolean` value indicating the last Exchange in the batch. Is only `true` for the last entry.

This allows you, for instance, to know how many files exist in this batch and to let the Aggregator aggregate this number of files.
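
To show how the three batch properties relate to each other, the sketch below computes them for a list of polled file names. It models the property values only and does not use the Camel API: `BatchPropertiesSketch` and `describeBatch` are hypothetical names, and a plain `Map` stands in for the Exchange properties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; a Map stands in for the Exchange properties.
class BatchPropertiesSketch {

    /** Returns one property map per polled file, in polling order. */
    static List<Map<String, Object>> describeBatch(List<String> polledFiles) {
        List<Map<String, Object>> batch = new ArrayList<>();
        int size = polledFiles.size();
        for (int index = 0; index < size; index++) {
            Map<String, Object> properties = new LinkedHashMap<>();
            properties.put("CamelBatchSize", size);                   // same for every file
            properties.put("CamelBatchIndex", index);                 // starts from 0
            properties.put("CamelBatchComplete", index == size - 1);  // true only for the last
            batch.add(properties);
        }
        return batch;
    }

    public static void main(String[] args) {
        for (Map<String, Object> p : describeBatch(Arrays.asList("a.txt", "b.txt", "c.txt"))) {
            System.out.println(p);
        }
    }
}
```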

Common gotchas with folder and filenames

When Apache Camel is producing files (writing files) there are a few gotchas affecting how to set a filename of your choice. By default, Apache Camel will use the message ID as the filename, and since the message ID is normally a unique generated ID, you will end up with filenames such as: ID-MACHINENAME-2443-1211718892437-1-0. If such a filename is not desired, then you must provide a filename in the CamelFileName message header. The constant, Exchange.FILE_NAME, can also be used.

The sample code below produces files using the message ID as the filename:

from("direct:report").to("file:target/reports");

To use report.txt as the filename you have to do:

from("direct:report").setHeader(Exchange.FILE_NAME, constant("report.txt")).to("file:target/reports");

Or the same as above, but with CamelFileName:

from("direct:report").setHeader("CamelFileName", constant("report.txt")).to("file:target/reports");

And a syntax where we set the filename on the endpoint with the fileName URI option.

from("direct:report").to("file:target/reports/?fileName=report.txt");

Filename Expression

Filename can be set either using the expression option or as a string-based File Language expression in the CamelFileName header. See the File Language for syntax and samples.

Consuming files from folders where others drop files directly

Beware if you consume files from a folder where other applications write files directly. Take a look at the different readLock options to see what suits your use cases. The best approach, however, is to write to another folder and, after the write completes, move the file into the drop folder. If you do write files directly to the drop folder, the changed option can better detect whether a file is currently being written or copied, as it uses a file-changed algorithm to see whether the file size or modification timestamp changes over a period of time. The other read lock options rely on the Java File API, which is not always good at detecting this. You may also want to look at the doneFileName option, which uses a marker file (done) to signal when a file is done and ready to be consumed.
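
The file-changed idea mentioned above can be sketched with the JDK alone: take two observations of size and modification time some interval apart and treat the file as ready only if nothing moved. This is a simplified model, not Camel's `changed` strategy: `ChangedReadLockSketch`, `isStable` and `probe` are hypothetical names, and the single comparison ignores timeouts and marker files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; not part of Camel.
class ChangedReadLockSketch {

    /** Pure comparison: same size and timestamp on both checks, and enough data. */
    static boolean isStable(long prevLength, long prevModified,
                            long length, long modified, long minLength) {
        return length == prevLength && modified == prevModified && length >= minLength;
    }

    /** Observes a real file twice, checkIntervalMillis apart. */
    static boolean probe(Path file, long checkIntervalMillis, long minLength) {
        try {
            long prevLength = Files.size(file);
            long prevModified = Files.getLastModifiedTime(file).toMillis();
            Thread.sleep(checkIntervalMillis);
            return isStable(prevLength, prevModified,
                    Files.size(file), Files.getLastModifiedTime(file).toMillis(), minLength);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false; // treat an interrupted wait as "lock not acquired"
        }
    }

    public static void main(String[] args) {
        // A file that grew from 10 to 12 bytes between checks is still being written.
        System.out.println(isStable(10, 1000L, 12, 1000L, 1));
    }
}
```

The `checkIntervalMillis` and `minLength` parameters play the roles that `readLockCheckInterval` and `readLockMinLength` play in the options table; a slow writer needs a longer interval, otherwise two checks can fall between two writes and the file looks stable too early.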

Using done files

Available as of Camel 2.6

See also section writing done files below.

If you want only to consume files when a done file exists, then you can use the doneFileName option on the endpoint.

      from("file:bar?doneFileName=done");

This will only consume files from the bar folder if a file named done exists in the same directory as the target files. Camel will automatically delete the done file when it's done consuming the files.

However, it's more common to have one done file per target file. This means there is a 1:1 correlation. To do this you must use dynamic placeholders in the doneFileName option. Currently Camel supports the following two dynamic tokens: file:name and file:name.noext, which must be enclosed in ${ }. The consumer only supports the static part of the done file name as either prefix or suffix (not both).

      from("file:bar?doneFileName=${file:name}.done");

In this example, files are only polled if a done file exists whose name is the file name followed by .done. For example:

hello.txt - is the file to be consumed
hello.txt.done - is the associated done file

You can also use a prefix for the done file, such as:

      from("file:bar?doneFileName=ready-${file:name}");

hello.txt - is the file to be consumed
ready-hello.txt - is the associated done file

Writing done files

Available as of Camel 2.6

After you have written a file you may want to write an additional done file as a kind of marker, to indicate to others that the file is finished and has been written. To do that you can use the doneFileName option on the file producer endpoint.

.to("file:bar?doneFileName=done");

Will simply create a file named done in the same directory as the target file.

However, it's more common to have one done file per target file. This means there is a 1:1 correlation. To do this you must use dynamic placeholders in the doneFileName option. Currently Camel supports the following two dynamic tokens: file:name and file:name.noext, which must be enclosed in ${ }.

.to("file:bar?doneFileName=done-${file:name}");

Will for example create a file named done-foo.txt if the target file was foo.txt in the same directory as the target file.

.to("file:bar?doneFileName=${file:name}.done");

Will for example create a file named foo.txt.done if the target file was foo.txt in the same directory as the target file.

.to("file:bar?doneFileName=${file:name.noext}.done");

Will for example create a file named foo.done if the target file was foo.txt in the same directory as the target file.
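
The three naming patterns above boil down to simple token substitution. The following minimal sketch reproduces the foo.txt examples; it is not Camel's resolver. `DoneFileNameSketch` and `resolve` are hypothetical names, and splitting the extension at the last dot is an assumption of this sketch that is only checked here for single-extension names.

```java
// Hypothetical illustration only; not part of Camel.
class DoneFileNameSketch {

    /** Replaces the two supported tokens in a doneFileName pattern. */
    static String resolve(String doneFileName, String fileName) {
        // Assumption: the extension starts at the last dot.
        int dot = fileName.lastIndexOf('.');
        String noExt = dot > 0 ? fileName.substring(0, dot) : fileName;
        return doneFileName
                .replace("${file:name.noext}", noExt)
                .replace("${file:name}", fileName);
    }

    public static void main(String[] args) {
        System.out.println(resolve("done-${file:name}", "foo.txt"));
        System.out.println(resolve("${file:name}.done", "foo.txt"));
        System.out.println(resolve("${file:name.noext}.done", "foo.txt"));
    }
}
```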

Read from a directory and write to another directory

from("file://inputdir/?delete=true").to("file://outputdir")

Read from a directory and write to another directory using an overrule dynamic name

from("file://inputdir/?delete=true").to("file://outputdir?overruleFile=copy-of-${file:name}")

Listen on a directory and create a message for each file dropped there. Copy the contents to the outputdir and delete the file in the inputdir.

Reading recursively from a directory and writing to another

from("file://inputdir/?recursive=true&delete=true").to("file://outputdir")

Listen on a directory and create a message for each file dropped there. Copy the contents to the outputdir and delete the file in the inputdir. Will scan recursively into sub-directories. Will lay out the files in the same directory structure in the outputdir as the inputdir, including any sub-directories.

inputdir/foo.txt
inputdir/sub/bar.txt

Will result in the following output layout:

outputdir/foo.txt
outputdir/sub/bar.txt

Using flatten

If you want to store all the files directly in the outputdir directory, disregarding the source directory layout (that is, to flatten out the path), you just add the flatten=true option on the file producer side:

from("file://inputdir/?recursive=true&delete=true").to("file://outputdir?flatten=true")

Will result in the following output layout:

outputdir/foo.txt
outputdir/bar.txt
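
The difference between the default layout and the flattened layout can be sketched with plain java.nio paths. This is only an illustration of the path arithmetic, not Camel's implementation; the class FlattenSketch and its target method are hypothetical names.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class FlattenSketch {

    private FlattenSketch() {
    }

    // Computes where a consumed file would land in the output directory.
    // relativeFile is the path of the file relative to the input directory.
    static Path target(Path outputDir, Path relativeFile, boolean flatten) {
        // With flatten, only the last path element (the file name) is kept.
        Path name = flatten ? relativeFile.getFileName() : relativeFile;
        return outputDir.resolve(name);
    }

    public static void main(String[] args) {
        Path out = Paths.get("outputdir");
        System.out.println(target(out, Paths.get("sub", "bar.txt"), false));
        System.out.println(target(out, Paths.get("sub", "bar.txt"), true));
    }
}
```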

Reading from a directory and the default move operation

Apache Camel will by default move any processed file into a .camel subdirectory in the directory the file was consumed from.

from("file://inputdir/?recursive=true").to("file://outputdir")

This affects the layout as follows. Before:

inputdir/foo.txt
inputdir/sub/bar.txt

After:

inputdir/.camel/foo.txt
inputdir/sub/.camel/bar.txt
outputdir/foo.txt
outputdir/sub/bar.txt
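
As a rough illustration of the layout above, the location of a processed file inside a .camel subdirectory can be computed with plain java.nio paths. The class DefaultMoveSketch and its method are hypothetical and only mirror the before and after listing; they are not part of Camel.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class DefaultMoveSketch {

    private DefaultMoveSketch() {
    }

    // Location a consumed file ends up in when it is moved into a
    // ".camel" subdirectory of the directory it was consumed from.
    static Path processedLocation(Path consumedFile) {
        return consumedFile.resolveSibling(".camel").resolve(consumedFile.getFileName());
    }

    public static void main(String[] args) {
        System.out.println(processedLocation(Paths.get("inputdir", "foo.txt")));
        System.out.println(processedLocation(Paths.get("inputdir", "sub", "bar.txt")));
    }
}
```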

Read from a directory and process the message in java

from("file://inputdir/").process(new Processor() {
  public void process(Exchange exchange) throws Exception {
    Object body = exchange.getIn().getBody();
    // do some business logic with the input body
  }
});

The body will be a File object that points to the file that was just dropped into the inputdir directory.

Read files from a directory and send the content to a jms queue

from("file://inputdir/").convertBodyTo(String.class).to("jms:test.queue")

By default the file endpoint sends a FileMessage which contains a File object as the body. If you send this directly to the JMS component the JMS message will only contain the File object but not the content. By converting the File to a String, the message will contain the file contents, which is probably what you want.

The route above using Spring DSL:

   <route>
      <from uri="file://inputdir/"/>
      <convertBodyTo type="java.lang.String"/>
      <to uri="jms:test.queue"/>
   </route>

Writing to files

Apache Camel is of course also able to write files, i.e. produce files. In the sample below we receive some reports on the direct:reports endpoint, which we process before they are written to a directory.

public void testToFile() throws Exception {
    MockEndpoint mock = getMockEndpoint("mock:result");
    mock.expectedMessageCount(1);
    mock.expectedFileExists("target/test-reports/report.txt");

    template.sendBody("direct:reports", "This is a great report");

    assertMockEndpointsSatisfied();
}

protected JndiRegistry createRegistry() throws Exception {
    // bind our processor in the registry with the given id
    JndiRegistry reg = super.createRegistry();
    reg.bind("processReport", new ProcessReport());
    return reg;
}

protected RouteBuilder createRouteBuilder() throws Exception {
    return new RouteBuilder() {
        public void configure() throws Exception {
            // the reports from the direct:reports endpoint are processed by our processor
            // before they are written to files in the target/test-reports directory
            from("direct:reports").processRef("processReport").to("file://target/test-reports", "mock:result");
        }
    };
}

private static class ProcessReport implements Processor {

    public void process(Exchange exchange) throws Exception {
        String body = exchange.getIn().getBody(String.class);
        // do some business logic here

        // set the output to the file
        exchange.getOut().setBody(body);

        // set the output filename using java code logic, notice that this is done by setting
        // a special header property of the out exchange
        exchange.getOut().setHeader(Exchange.FILE_NAME, "report.txt");
    }

}

Write to subdirectory using Exchange.FILE_NAME

Using a single route, it is possible to write a file to any number of subdirectories. If you have a route set up as such:

  <route>
    <from uri="bean:myBean"/>
    <to uri="file:/rootDirectory"/>
  </route>

You can have myBean set the header Exchange.FILE_NAME to values such as:

Exchange.FILE_NAME = hello.txt => /rootDirectory/hello.txt
Exchange.FILE_NAME = foo/bye.txt => /rootDirectory/foo/bye.txt

This allows you to have a single route to write files to multiple destinations.

Writing file through the temporary directory relative to the final destination

Sometimes you need to temporarily write the files to some directory relative to the destination directory. This situation usually arises when some external process with limited filtering capabilities is reading from the directory you are writing to. In the example below, files are written to the /var/myapp/filesInProgress directory and, after the data transfer is done, they are atomically moved to the /var/myapp/finalDirectory directory.

from("direct:start").
  to("file:///var/myapp/finalDirectory?tempPrefix=/../filesInProgress/");
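
The write-then-move idea can be sketched in plain Java, independent of Camel. The class TempThenMove is hypothetical, and passing the in-progress directory as a relative path such as ../filesInProgress is a simplification made for this sketch rather than how the tempPrefix option is parsed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

final class TempThenMove {

    private TempThenMove() {
    }

    // Resolves the in-progress location of a file, relative to the final directory.
    static Path inProgressPath(Path finalDir, String relativeTempDir, String fileName) {
        return finalDir.resolve(relativeTempDir).resolve(fileName).normalize();
    }

    // Writes the content to the in-progress location first, then moves it
    // into the final directory in a single step.
    static Path write(Path finalDir, String relativeTempDir, String fileName, String content)
            throws IOException {
        Path temp = inProgressPath(finalDir, relativeTempDir, fileName);
        Files.createDirectories(temp.getParent());
        Files.createDirectories(finalDir);
        Files.write(temp, content.getBytes(StandardCharsets.UTF_8));
        // ATOMIC_MOVE fails with AtomicMoveNotSupportedException when the
        // file system cannot move atomically (for example across volumes).
        return Files.move(temp, finalDir.resolve(fileName), StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) {
        System.out.println(inProgressPath(
                Paths.get("/var/myapp/finalDirectory"), "../filesInProgress", "report.txt"));
    }
}
```

Readers that poll the final directory then only ever see complete files, because a file appears there only after the move.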

Using expression for filenames

In this sample we want to move consumed files to a backup folder using today's date as a sub-folder name:

from("file://inbox?move=backup/${date:now:yyyyMMdd}/${file:name}").to("...");

See File Language for more samples.
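
To make the result of the move expression concrete, the following plain-Java sketch builds the same kind of relative path with java.time. It is not how Camel evaluates the File Language; the class BackupPathSketch and its method are hypothetical names used only for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class BackupPathSketch {

    // Same pattern as the one used in the move option above.
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyyMMdd");

    private BackupPathSketch() {
    }

    // Builds "backup/<yyyyMMdd>/<file name>" for the given day.
    static String backupPath(LocalDate day, String fileName) {
        return "backup/" + DAY.format(day) + "/" + fileName;
    }

    public static void main(String[] args) {
        System.out.println(backupPath(LocalDate.now(), "report.txt"));
    }
}
```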

Avoiding reading the same file more than once (idempotent consumer)

Apache Camel supports Idempotent Consumer directly within the component so it will skip already processed files. This feature can be enabled by setting the idempotent=true option.

from("file://inbox?idempotent=true").to("...");

Camel uses the absolute file name as the idempotent key to detect duplicate files. From Camel 2.11 onwards you can customize this key by using an expression in the idempotentKey option. For example, to use both the name and the file size as the key:

  <route>
    <from uri="file://inbox?idempotent=true&amp;idempotentKey=${file:name}-${file:size}"/>
    <to uri="bean:processInbox"/>
  </route>

By default Apache Camel uses an in-memory store for keeping track of consumed files; it uses a least recently used cache holding up to 1000 entries. You can plug in your own implementation of this store by using the idempotentRepository option, using the # sign in the value to indicate that it refers to a bean in the Registry with the specified id.

   <!-- define our store as a plain spring bean -->
   <bean id="myStore" class="com.mycompany.MyIdempotentStore"/>

  <route>
    <from uri="file://inbox?idempotent=true&amp;idempotentRepository=#myStore"/>
    <to uri="bean:processInbox"/>
  </route>

Apache Camel will log at DEBUG level if it skips a file because it has been consumed before:

DEBUG FileConsumer is idempotent and the file has been consumed before. Will skip this file: target\idempotent\report.txt
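
The behaviour of a bounded least recently used store, as described for the default in-memory repository, can be sketched with a LinkedHashMap. This is an illustration of the technique under stated assumptions, not Camel's repository class; LruIdempotentStore and its methods are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LruIdempotentStore {

    private final Map<String, Boolean> seen;

    LruIdempotentStore(final int maxEntries) {
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                // Evict the least recently used key once the limit is exceeded.
                return size() > maxEntries;
            }
        };
    }

    // Returns true when the key is new (the file should be processed),
    // false when it has been seen before (the file should be skipped).
    synchronized boolean add(String key) {
        if (seen.get(key) != null) { // get() also refreshes the key's recency
            return false;
        }
        seen.put(key, Boolean.TRUE);
        return true;
    }

    // Does not change the recency order.
    synchronized boolean contains(String key) {
        return seen.containsKey(key);
    }
}
```

A store created with new LruIdempotentStore(1000) would hold up to 1000 keys. Note that a key evicted from such a cache is treated as new again, which is one reason to consider a persistent repository.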

Using a file based idempotent repository

In this section we will use the file based idempotent repository org.apache.camel.processor.idempotent.FileIdempotentRepository instead of the in-memory one that is used by default. This repository uses a 1st level cache to avoid reading the file repository. It only uses the file repository to store the content of the 1st level cache, so the repository can survive server restarts. It loads the content of the file into the 1st level cache upon startup. The file structure is very simple, as it stores each key on a separate line in the file. By default, the file store has a size limit of 1mb; when the file grows larger, Apache Camel truncates the file store and rebuilds the content by flushing the 1st level cache into a fresh empty file.

We configure our repository using Spring XML, creating our file idempotent repository and defining our file consumer to use it through the idempotentRepository option, using the # sign to indicate a Registry lookup:

<!-- this is our file based idempotent store configured to use the .filestore.dat as file -->
<bean id="fileStore" class="org.apache.camel.processor.idempotent.FileIdempotentRepository">
    <!-- the filename for the store -->
    <property name="fileStore" value="target/fileidempotent/.filestore.dat"/>
    <!-- the max filesize in bytes for the file. Apache Camel will truncate and flush the cache
         if the file gets bigger -->
    <property name="maxFileStoreSize" value="512000"/>
    <!-- the number of elements in our store -->
    <property name="cacheSize" value="250"/>
</bean>

<camelContext xmlns="http://camel.apache.org/schema/spring">
    <route>
        <from uri="file://target/fileidempotent/?idempotent=true&amp;idempotentRepository=#fileStore&amp;move=done/${file:name}"/>
        <to uri="mock:result"/>
    </route>
</camelContext>
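
The idea of a key store that keeps one key per line and reloads the file on startup can be sketched in plain Java. This is a simplified illustration, not the FileIdempotentRepository implementation: the class FileBackedKeyStore is hypothetical, and it leaves out the size limit and the truncate-and-rebuild step.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

final class FileBackedKeyStore {

    private final Path store;
    // In-memory copy of the keys, so lookups never read the file.
    private final Set<String> cache = new LinkedHashSet<String>();

    FileBackedKeyStore(Path store) {
        this.store = store;
        try {
            // On startup, load any keys written by a previous run.
            if (Files.exists(store)) {
                cache.addAll(Files.readAllLines(store, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true when the key is new; new keys are appended as one line each.
    synchronized boolean add(String key) {
        if (!cache.add(key)) {
            return false;
        }
        try {
            Files.write(store, Collections.singletonList(key), StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }

    synchronized boolean contains(String key) {
        return cache.contains(key);
    }
}
```

Creating a second FileBackedKeyStore on the same path simulates a restart: keys added by the first instance are still reported as already seen.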

Using a JPA based idempotent repository

In this section we will use the JPA based idempotent repository instead of the in-memory based that is used as default.

First we need a persistence-unit in META-INF/persistence.xml where we need to use the class org.apache.camel.processor.idempotent.jpa.MessageProcessed as model.

<persistence-unit name="idempotentDb" transaction-type="RESOURCE_LOCAL">
  <class>org.apache.camel.processor.idempotent.jpa.MessageProcessed</class>

  <properties>
    <property name="openjpa.ConnectionURL" value="jdbc:derby:target/idempotentTest;create=true"/>
    <property name="openjpa.ConnectionDriverName" value="org.apache.derby.jdbc.EmbeddedDriver"/>
    <property name="openjpa.jdbc.SynchronizeMappings" value="buildSchema"/>
    <property name="openjpa.Log" value="DefaultLevel=WARN, Tool=INFO"/>
  </properties>
</persistence-unit>

Then we need to set up a Spring jpaTemplate in the Spring XML file:

<!-- this is standard spring JPA configuration -->
<bean id="jpaTemplate" class="org.springframework.orm.jpa.JpaTemplate">
    <property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>

<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalEntityManagerFactoryBean">
    <!-- we use idempotentDb as the persistence unit name defined in the persistence.xml file -->
    <property name="persistenceUnitName" value="idempotentDb"/>
</bean>

And finally we can create our JPA idempotent repository in the spring XML file as well:

<!-- we define our jpa based idempotent repository we want to use in the file consumer -->
<bean id="jpaStore" class="org.apache.camel.processor.idempotent.jpa.JpaMessageIdRepository">
    <!-- Here we refer to the spring jpaTemplate -->
    <constructor-arg index="0" ref="jpaTemplate"/>
    <!-- This 2nd parameter is the name (= a category name).
         You can have different repositories with different names -->
    <constructor-arg index="1" value="FileConsumer"/>
</bean>

And then we just need to reference the jpaStore bean in the file consumer endpoint, using the idempotentRepository option and the # syntax:

  <route>
    <from uri="file://inbox?idempotent=true&amp;idempotentRepository=#jpaStore"/>
    <to uri="bean:processInbox"/>
  </route>

Filter using org.apache.camel.component.file.GenericFileFilter

Apache Camel supports pluggable filtering strategies. You can then configure the endpoint with such a filter to skip certain files being processed.

In the sample we have built our own filter that skips files starting with skip in the filename:

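import org.apache.camel.component.file.GenericFile;
import org.apache.camel.component.file.GenericFileFilter;

public class MyFileFilter<T> implements GenericFileFilter<T> {
    public boolean accept(GenericFile<T> file) {
        // we don't accept any files starting with skip in the name
        return !file.getFileName().startsWith("skip");
    }
}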

And then we can configure our route using the filter attribute to reference our filter (using # notation) that we have defined in the spring XML file:

   <!-- define our filter as a plain spring bean -->
   <bean id="myFilter" class="com.mycompany.MyFileFilter"/>

  <route>
    <from uri="file://inbox?filter=#myFilter"/>
    <to uri="bean:processInbox"/>
  </route>

Filtering using ANT path matcher

The ANT path matcher is shipped out-of-the-box in the camel-spring jar, so you need to depend on camel-spring if you are using Maven. The reason is that we leverage Spring's AntPathMatcher to do the actual matching.

The file paths are matched with the following rules:

? matches one character
* matches zero or more characters
** matches zero or more directories in a path

The sample below demonstrates how to use it:

<camelContext xmlns="http://camel.apache.org/schema/spring">
    <template id="camelTemplate"/>

    <!-- use myFilter as filter to allow setting ANT paths for which files to scan for -->
    <endpoint id="myFileEndpoint" uri="file://target/antpathmatcher?recursive=true&amp;filter=#myAntFilter"/>

    <route>
        <from ref="myFileEndpoint"/>
        <to uri="mock:result"/>
    </route>
</camelContext>

<!-- we use the antpath file filter to use ant paths for includes and excludes -->
<bean id="myAntFilter" class="org.apache.camel.component.file.AntPathMatcherGenericFileFilter">
    <!-- include any file in the subfolder that has day in the name -->
    <property name="includes" value="**/subfolder/**/*day*"/>
    <!-- exclude all files with bad in name or .xml files. Use comma to separate multiple excludes -->
    <property name="excludes" value="**/*bad*,**/*.xml"/>
</bean>
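
To get a feel for the three matching rules listed in this section, the following plain-Java sketch translates an ANT-style pattern into a regular expression. It is a simplified approximation written for illustration and is not Spring's AntPathMatcher, so edge cases may behave differently; the class AntStyleSketch and its methods are hypothetical names.

```java
import java.util.regex.Pattern;

final class AntStyleSketch {

    private AntStyleSketch() {
    }

    // Translates an ANT-style pattern into a regular expression.
    static Pattern toRegex(String antPattern) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < antPattern.length()) {
            if (antPattern.startsWith("**/", i)) {
                regex.append("(?:.*/)?");   // zero or more directories
                i += 3;
            } else if (antPattern.startsWith("**", i)) {
                regex.append(".*");         // anything, including separators
                i += 2;
            } else if (antPattern.charAt(i) == '*') {
                regex.append("[^/]*");      // zero or more characters in one path element
                i++;
            } else if (antPattern.charAt(i) == '?') {
                regex.append("[^/]");       // exactly one character
                i++;
            } else {
                // Any other character is matched literally.
                regex.append(Pattern.quote(String.valueOf(antPattern.charAt(i))));
                i++;
            }
        }
        return Pattern.compile(regex.toString());
    }

    // True when the whole path matches the pattern.
    static boolean matches(String antPattern, String path) {
        return toRegex(antPattern).matcher(path).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("**/subfolder/**/*day*", "subfolder/monday.txt"));
        System.out.println(matches("**/*.xml", "a/b/c.xml"));
        System.out.println(matches("**/*bad*", "good.txt"));
    }
}
```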

Sorting using Comparator

Apache Camel supports pluggable sorting strategies. This strategy is to use the built-in java.util.Comparator in Java. You can then configure the endpoint with such a comparator and have Apache Camel sort the files before they are processed.

In the sample we have built our own comparator that just sorts by file name:

import java.util.Comparator;
import org.apache.camel.component.file.GenericFile;

public class MyFileSorter<T> implements Comparator<GenericFile<T>> {
    public int compare(GenericFile<T> o1, GenericFile<T> o2) {
        return o1.getFileName().compareToIgnoreCase(o2.getFileName());
    }
}

And then we can configure our route using the sorter option to reference our sorter (mySorter) that we have defined in the Spring XML file:

   <!-- define our sorter as a plain spring bean -->
   <bean id="mySorter" class="com.mycompany.MyFileSorter"/>

  <route>
    <from uri="file://inbox?sorter=#mySorter"/>
    <to uri="bean:processInbox"/>
  </route>

URI options can reference beans using the # syntax

In the Spring DSL route above, notice that we can reference beans in the Registry by prefixing the id with #. So writing sorter=#mySorter instructs Apache Camel to look in the Registry for a bean with the ID mySorter.

Sorting using sortBy

Apache Camel supports pluggable sorting strategies. This strategy is to use the File Language to configure the sorting. The sortBy option is configured as follows:

sortBy=group 1;group 2;group 3;...

Where each group is separated by a semicolon. In simple situations you use just one group, so a simple example could be:

sortBy=file:name

This will sort by file name. You can reverse the order by prefixing reverse: to the group, so the sorting is now Z..A:

sortBy=reverse:file:name

As we have the full power of File Language we can use some of the other parameters, so if we want to sort by file size we do:

sortBy=file:length

You can configure to ignore the case, using ignoreCase: for string comparison, so if you want to use file name sorting but to ignore the case then we do:

sortBy=ignoreCase:file:name

You can combine ignore case and reverse, however reverse must be specified first:

sortBy=reverse:ignoreCase:file:name

In the sample below we want to sort by last modified file, so we do:

sortBy=file:modified

And then we want to group by name as a 2nd option, so files with the same modification time are sorted by name:

sortBy=file:modified;file:name

Now there is an issue here, can you spot it? The modified timestamp of the file is too fine-grained, as it is in milliseconds. What if we want to sort by date only and then subgroup by name? As we have the full power of File Language, we can use its date command, which supports patterns. So this can be solved as:

sortBy=date:file:yyyyMMdd;file:name

That is pretty powerful. You can also use reverse per group, so we could reverse the file names:

sortBy=date:file:yyyyMMdd;reverse:file:name
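
The effect of sorting by several groups can be illustrated with a chained java.util.Comparator: sort by day first, then by reversed file name within the same day. This is a plain-Java sketch of the ordering only, not how Camel evaluates sortBy; the FileInfo type is hypothetical, and using UTC to derive the day is an assumption made for the sketch.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class SortBySketch {

    private SortBySketch() {
    }

    // Hypothetical stand-in for a consumed file: name and last-modified time.
    static final class FileInfo {
        final String name;
        final long lastModified;

        FileInfo(String name, long lastModified) {
            this.name = name;
            this.lastModified = lastModified;
        }
    }

    // Reduces a millisecond timestamp to its calendar day (UTC assumed).
    static LocalDate day(long millis) {
        return Instant.ofEpochMilli(millis).atZone(ZoneOffset.UTC).toLocalDate();
    }

    // First group: day of modification. Second group: file name, reversed.
    static Comparator<FileInfo> byDayThenReverseName() {
        Comparator<FileInfo> byDay = Comparator.comparing((FileInfo f) -> day(f.lastModified));
        Comparator<FileInfo> byNameReversed = Comparator.comparing((FileInfo f) -> f.name).reversed();
        return byDay.thenComparing(byNameReversed);
    }

    // Returns the file names in sorted order, leaving the input untouched.
    static List<String> sortedNames(List<FileInfo> files) {
        return files.stream()
                .sorted(byDayThenReverseName())
                .map(f -> f.name)
                .collect(Collectors.toList());
    }
}
```

Because the timestamp is reduced to a day before comparing, two files modified at different times on the same day fall into the same first group, and only then does the second group decide their order.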

Using GenericFileProcessStrategy

The processStrategy option can be used to plug in a custom GenericFileProcessStrategy that allows you to implement your own begin, commit and rollback logic. For instance, let's assume a system writes a file to a folder that you should consume, but you should not start consuming the file before another ready file has been written as well.

So by implementing our own GenericFileProcessStrategy we can implement this as:

In the begin() method we can test whether the special ready file exists. The begin method returns a boolean to indicate if we can consume the file or not.
In the commit() method we can move the actual file and also delete the ready file.
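
The begin and commit logic described above can be sketched in plain Java without the Camel interfaces. The class ReadyFileGate is hypothetical and does not implement GenericFileProcessStrategy; the naming convention of the ready file (the file name followed by .ready) is an assumption made for this sketch, since the text does not specify one.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class ReadyFileGate {

    private ReadyFileGate() {
    }

    // Assumed convention: the ready marker is "<file name>.ready" next to the data file.
    static Path readyFileFor(Path file) {
        return file.resolveSibling(file.getFileName().toString() + ".ready");
    }

    // Mirrors the idea of begin(): consume only when the ready marker exists.
    static boolean begin(Path file) {
        return Files.exists(readyFileFor(file));
    }

    // Mirrors the idea of commit(): move the data file, then delete the marker.
    static void commit(Path file, Path doneDir) throws IOException {
        Files.createDirectories(doneDir);
        Files.move(file, doneDir.resolve(file.getFileName()), StandardCopyOption.REPLACE_EXISTING);
        Files.deleteIfExists(readyFileFor(file));
    }
}
```

In a real strategy these two steps would live in the begin and commit methods of the custom GenericFileProcessStrategy, with rollback left to decide what happens to the file on failure.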

Important when using consumer.bridgeErrorHandler

When using consumer.bridgeErrorHandler, interceptors and onCompletion do not apply. The Exchange is processed directly by the Camel Error Handler, which does not allow prior actions such as interceptors or onCompletion to take effect.

Debug logging

This component has log level TRACE that can be helpful if you have problems.
