Chapter 360. uniVocity TSV DataFormat
Available as of Camel version 2.15
This Data Format uses uniVocity-parsers for reading and writing 3 kinds of tabular data text files:
- CSV (Comma Separated Values), where the values are separated by a symbol (usually a comma)
- fixed-width, where the values have known sizes
- TSV (Tab Separated Values), where the fields are separated by a tab character
Thus there are 3 data formats based on uniVocity-parsers.
If you use Maven, add the following to your pom.xml, replacing the version number with the latest release (see the download page for the latest versions).
<dependency>
  <groupId>org.apache.camel</groupId>
  <artifactId>camel-univocity-parsers</artifactId>
  <version>x.x.x</version>
</dependency>
360.1. Options
Most configuration options of the uniVocity-parsers are available in the data formats. If you want more information about a particular option, please refer to their documentation page.
The 3 data formats share common options and also have dedicated ones. This section presents them all.
360.2. Options
The uniVocity TSV dataformat supports 15 options, which are listed below.
Name | Default | Java Type | Description |
---|---|---|---|
escapeChar | \ | String | The escape character. |
nullValue | | String | The string representation of a null value. The default value is null. |
skipEmptyLines | true | Boolean | Whether or not the empty lines must be ignored. The default value is true. |
ignoreTrailingWhitespaces | true | Boolean | Whether or not the trailing white spaces must be ignored. The default value is true. |
ignoreLeadingWhitespaces | true | Boolean | Whether or not the leading white spaces must be ignored. The default value is true. |
headersDisabled | false | Boolean | Whether or not the headers are disabled. When defined, this option explicitly sets the headers as null, which indicates that there is no header. The default value is false. |
headerExtractionEnabled | false | Boolean | Whether or not the header must be read from the first line of the text document. The default value is false. |
numberOfRecordsToRead | | Integer | The maximum number of records to read. |
emptyValue | | String | The string representation of an empty value. |
lineSeparator | | String | The line separator of the files. The default value is the JVM platform line separator. |
normalizedLineSeparator | | String | The normalized line separator of the files. The default value is a new line character. |
comment | # | String | The comment symbol. The default value is #. |
lazyLoad | false | Boolean | Whether the unmarshalling should produce an iterator that reads the lines on the fly, or whether all the lines must be read at once. The default value is false. |
asMap | false | Boolean | Whether the unmarshalling should produce maps for the line values instead of lists. It requires headers (either defined or collected). The default value is false. |
contentTypeHeader | false | Boolean | Whether the data format should set the Content-Type header with the type from the data format, if the data format is capable of doing so. For example, application/xml for data formats marshalling to XML, or application/json for data formats marshalling to JSON. |
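As a hedged illustration of how such options can be set in the XML DSL, the sketch below assumes that each option is exposed as an attribute of the same name on the univocity-tsv element, in the same way that headerExtractionEnabled and asMap are used in the unmarshalling examples of this chapter. The endpoint URIs and the option values are placeholders chosen for illustration only.

```xml
<route>
  <from uri="direct:input"/>
  <unmarshal>
    <!-- Assumption: options map to attributes of the same name.
         Keep empty lines, treat ';' as the comment symbol,
         and read at most 100 records. -->
    <univocity-tsv skipEmptyLines="false" comment=";" numberOfRecordsToRead="100"/>
  </unmarshal>
  <to uri="mock:result"/>
</route>
```

Options that are not listed on the element keep the defaults shown in the table.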
360.3. Spring Boot Auto-Configuration
The component supports 16 options, which are listed below.
Name | Description | Default | Type |
---|---|---|---|
camel.dataformat.univocity-tsv.as-map | Whether the unmarshalling should produce maps for the line values instead of lists. It requires headers (either defined or collected). The default value is false. | false | Boolean |
camel.dataformat.univocity-tsv.comment | The comment symbol. The default value is #. | # | String |
camel.dataformat.univocity-tsv.content-type-header | Whether the data format should set the Content-Type header with the type from the data format, if the data format is capable of doing so. For example, application/xml for data formats marshalling to XML, or application/json for data formats marshalling to JSON. | false | Boolean |
camel.dataformat.univocity-tsv.empty-value | The string representation of an empty value. | | String |
camel.dataformat.univocity-tsv.enabled | Enable the univocity-tsv dataformat. | true | Boolean |
camel.dataformat.univocity-tsv.escape-char | The escape character. | \ | String |
camel.dataformat.univocity-tsv.header-extraction-enabled | Whether or not the header must be read from the first line of the text document. The default value is false. | false | Boolean |
camel.dataformat.univocity-tsv.headers-disabled | Whether or not the headers are disabled. When defined, this option explicitly sets the headers as null, which indicates that there is no header. The default value is false. | false | Boolean |
camel.dataformat.univocity-tsv.ignore-leading-whitespaces | Whether or not the leading white spaces must be ignored. The default value is true. | true | Boolean |
camel.dataformat.univocity-tsv.ignore-trailing-whitespaces | Whether or not the trailing white spaces must be ignored. The default value is true. | true | Boolean |
camel.dataformat.univocity-tsv.lazy-load | Whether the unmarshalling should produce an iterator that reads the lines on the fly, or whether all the lines must be read at once. The default value is false. | false | Boolean |
camel.dataformat.univocity-tsv.line-separator | The line separator of the files. The default value is the JVM platform line separator. | | String |
camel.dataformat.univocity-tsv.normalized-line-separator | The normalized line separator of the files. The default value is a new line character. | | String |
camel.dataformat.univocity-tsv.null-value | The string representation of a null value. The default value is null. | | String |
camel.dataformat.univocity-tsv.number-of-records-to-read | The maximum number of records to read. | | Integer |
camel.dataformat.univocity-tsv.skip-empty-lines | Whether or not the empty lines must be ignored. The default value is true. | true | Boolean |
360.4. Marshalling usages
The marshalling accepts either:
- A list of maps (`List<Map<String, ?>>`), one for each line
- A single map (`Map<String, ?>`), for a single line
Any other body throws an exception.
360.4.1. Usage example: marshalling a Map into CSV format
<route>
  <from uri="direct:input"/>
  <marshal>
    <univocity-csv/>
  </marshal>
  <to uri="mock:result"/>
</route>
360.4.2. Usage example: marshalling a Map into fixed-width format
<route>
  <from uri="direct:input"/>
  <marshal>
    <univocity-fixed padding="_">
      <univocity-header length="5"/>
      <univocity-header length="5"/>
      <univocity-header length="5"/>
    </univocity-fixed>
  </marshal>
  <to uri="mock:result"/>
</route>
360.4.3. Usage example: marshalling a Map into TSV format
<route>
  <from uri="direct:input"/>
  <marshal>
    <univocity-tsv/>
  </marshal>
  <to uri="mock:result"/>
</route>
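The nullValue and emptyValue options control how null and empty values are represented. The following variant is a minimal sketch, assuming these options are available as attributes of the same name on the univocity-tsv element and that they are applied when writing. The strings NULL and EMPTY are arbitrary placeholders, not defaults.

```xml
<route>
  <from uri="direct:input"/>
  <marshal>
    <!-- Illustrative values only: represent null values as "NULL"
         and empty values as "EMPTY" in the TSV output -->
    <univocity-tsv nullValue="NULL" emptyValue="EMPTY"/>
  </marshal>
  <to uri="mock:result"/>
</route>
```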
360.5. Unmarshalling usages
The unmarshalling uses an InputStream in order to read the data.
Each row produces either:
- a list with all the values in it (`asMap` option set to `false`);
- a map with all the values indexed by the headers (`asMap` option set to `true`).
All the rows can either:
- be collected at once into a list (`lazyLoad` option set to `false`);
- be read on the fly using an iterator (`lazyLoad` option set to `true`).
360.5.1. Usage example: unmarshalling a CSV format into maps with automatic headers
<route>
  <from uri="direct:input"/>
  <unmarshal>
    <univocity-csv headerExtractionEnabled="true" asMap="true"/>
  </unmarshal>
  <to uri="mock:result"/>
</route>
360.5.2. Usage example: unmarshalling a fixed-width format into lists
<route>
  <from uri="direct:input"/>
  <unmarshal>
    <univocity-fixed>
      <univocity-header length="5"/>
      <univocity-header length="5"/>
      <univocity-header length="5"/>
    </univocity-fixed>
  </unmarshal>
  <to uri="mock:result"/>
</route>