Chapter 10. Azure Storage Blob Service
Both producer and consumer are supported
The Azure Storage Blob component is used for storing and retrieving blobs from the Azure Storage Blob Service using Azure APIs v12. For API versions beyond v12, whether this component can adopt the changes will depend on how many breaking changes they introduce.
Prerequisites
You must have a valid Windows Azure Storage account. More information is available at the Azure Documentation Portal.
Maven users will need to add the following dependency to their pom.xml for this component:
<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-azure-storage-blob</artifactId>
    <version>{CamelSBVersion}</version>
    <!-- use the same version as your Camel core version -->
</dependency>
10.1. URI Format
azure-storage-blob://accountName[/containerName][?options]
For the consumer, accountName and containerName are required. For the producer, the required options depend on the requested operation: for container-level operations such as createBlobContainer, only accountName and containerName are required; for blob-level operations such as getBlob, accountName, containerName and blobName are all required.
The blob will be created if it does not already exist. You can append query options to the URI in the following format:
?options=value&option2=value&…
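For example, reusing the camelazure account and container1 container that appear later in this chapter, a container-level and a blob-level producer endpoint could look like this:
azure-storage-blob://camelazure/container1?operation=createBlobContainer&accessKey=yourAccessKey
azure-storage-blob://camelazure/container1?blobName=hello.txt&operation=getBlob&accessKey=yourAccessKey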
10.2. Configuring Options
Camel components are configured on two separate levels:
- component level
- endpoint level
10.2.1. Configuring Component Options
The component level is the highest level, and it holds general and common configurations that are inherited by the endpoints. For example, a component may have security settings, credentials for authentication, URLs for network connections, and so forth.
Some components have only a few options, while others may have many. Because components typically have sensible pre-configured defaults, you often only need to configure a few options on a component, or none at all.
Configuring components can be done with the Component DSL, in a configuration file (application.properties|yaml), or directly with Java code.
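For example, a minimal sketch of component-level configuration in application.properties, using keys taken from the Spring Boot table in Section 10.6 (the access key value is a placeholder):
camel.component.azure-storage-blob.access-key = yourAccessKey
camel.component.azure-storage-blob.lazy-start-producer = true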
10.2.2. Configuring Endpoint Options
You will find yourself configuring endpoints the most, as they often have many options that let you tailor what the endpoint should do. The options are also categorized according to whether the endpoint is used as a consumer (from), as a producer (to), or both.
Configuring endpoints is most often done directly in the endpoint URI as path and query parameters. You can also use the Endpoint DSL as a type safe way of configuring endpoints.
A good practice when configuring options is to use Property Placeholders, which allow you to avoid hardcoding URLs, port numbers, sensitive information, and other settings. In other words, placeholders let you externalize the configuration from your code, giving you more flexibility and reuse.
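For instance, a route could use placeholders instead of hardcoded values (a sketch; the blob.* property keys are hypothetical names you would define in your own configuration):
from("azure-storage-blob://{{blob.accountName}}/container1?blobName=hello.txt&accessKey={{blob.accessKey}}")
    .to("file://blobdirectory");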
The following two sections list all the options, first for the component and then for the endpoint.
10.3. Component Options
The Azure Storage Blob Service component supports 31 options, which are listed below.
Name | Description | Default | Type |
---|---|---|---|
blobName (common) | The blob name, to consume a specific blob from a container. On the producer, it is only required for blob-level operations. | String | |
blobOffset (common) | Set the blob offset for the upload or download operations, default is 0. | 0 | long |
blobType (common) | The blob type in order to initiate the appropriate settings for each blob type. Enum values: blockblob, appendblob, pageblob. | blockblob | BlobType |
closeStreamAfterRead (common) | Close the stream after read or keep it open, default is true. | true | boolean |
configuration (common) | The component configurations. | BlobConfiguration | |
credentials (common) | StorageSharedKeyCredential can be injected to create the azure client, this holds the important authentication information. | StorageSharedKeyCredential | |
dataCount (common) | How many bytes to include in the range. Must be greater than or equal to 0 if specified. | Long | |
fileDir (common) | The file directory where the downloaded blobs will be saved to; this can be used by both the producer and the consumer. | String | |
maxResultsPerPage (common) | Specifies the maximum number of blobs to return, including all BlobPrefix elements. If the request does not specify maxResultsPerPage or specifies a value greater than 5,000, the server will return up to 5,000 items. | Integer | |
maxRetryRequests (common) | Specifies the maximum number of additional HTTP Get requests that will be made while reading the data from a response body. | 0 | int |
prefix (common) | Filters the results to return only blobs whose names begin with the specified prefix. May be null to return all blobs. | String | |
regex (common) | Filters the results to return only blobs whose names match the specified regular expression. May be null to return all. If both prefix and regex are set, regex takes priority and prefix is ignored. | String | |
serviceClient (common) | Autowired Client to a storage account. This client does not hold any state about a particular storage account but is instead a convenient way of sending off appropriate requests to the resource on the service. It may also be used to construct URLs to blobs and containers. This client contains operations on a service account. Operations on a container are available on BlobContainerClient through BlobServiceClient#getBlobContainerClient(String), and operations on a blob are available on BlobClient through BlobContainerClient#getBlobClient(String). | BlobServiceClient | |
timeout (common) | An optional timeout value beyond which a RuntimeException will be raised. | Duration | |
bridgeErrorHandler (consumer) | Allows for bridging the consumer to the Camel routing Error Handler, which mean any exceptions occurred while the consumer is trying to pickup incoming messages, or the likes, will now be processed as a message and handled by the routing Error Handler. By default the consumer will use the org.apache.camel.spi.ExceptionHandler to deal with exceptions, that will be logged at WARN or ERROR level and ignored. | false | boolean |
blobSequenceNumber (producer) | A user-controlled value that you can use to track requests. The value of the sequence number must be between 0 and 2^63 - 1. The default value is 0. | 0 | Long |
blockListType (producer) | Specifies which type of blocks to return. Enum values: committed, uncommitted, all. | COMMITTED | BlockListType |
changeFeedContext (producer) | When using getChangeFeed producer operation, this gives additional context that is passed through the Http pipeline during the service call. | Context | |
changeFeedEndTime (producer) | When using getChangeFeed producer operation, this filters the results to return events approximately before the end time. Note: A few events belonging to the next hour can also be returned. A few events belonging to this hour can be missing; to ensure all events from the hour are returned, round the end time up by an hour. | OffsetDateTime | |
changeFeedStartTime (producer) | When using getChangeFeed producer operation, this filters the results to return events approximately after the start time. Note: A few events belonging to the previous hour can also be returned. A few events belonging to this hour can be missing; to ensure all events from the hour are returned, round the start time down by an hour. | OffsetDateTime | |
closeStreamAfterWrite (producer) | Close the stream after write or keep it open, default is true. | true | boolean |
commitBlockListLater (producer) | When set to true, the staged blocks will not be committed directly. | true | boolean |
createAppendBlob (producer) | When set to true, the append blob will be created when committing append blocks. | true | boolean |
createPageBlob (producer) | When set to true, the page blob will be created when uploading a page blob. | true | boolean |
downloadLinkExpiration (producer) | Override the default expiration (millis) of URL download link. | Long | |
lazyStartProducer (producer) | Whether the producer should be started lazy (on the first message). By starting lazy you can use this to allow CamelContext and routes to startup in situations where a producer may otherwise fail during starting and cause the route to fail being started. By deferring this startup to be lazy then the startup failure can be handled during routing messages via Camel’s routing error handlers. Beware that when the first message is processed then creating and starting the producer may take a little time and prolong the total processing time of the processing. | false | boolean |
operation (producer) | The blob operation that can be used with this component on the producer. The supported values are listed in the producer operations section below. | listBlobContainers | BlobOperationsDefinition |
pageBlobSize (producer) | Specifies the maximum size for the page blob, up to 8 TB. The page blob size must be aligned to a 512-byte boundary. | 512 | Long |
autowiredEnabled (advanced) | Whether autowiring is enabled. This is used for automatic autowiring options (the option must be marked as autowired) by looking up in the registry to find if there is a single instance of matching type, which then gets configured on the component. This can be used for automatic configuring JDBC data sources, JMS connection factories, AWS Clients, etc. | true | boolean |
accessKey (security) | Access key for the associated azure account name to be used for authentication with azure blob services. | String | |
sourceBlobAccessKey (security) | Source blob access key: for the copyBlob operation, an access key for the source blob is required. Passing an access key as a header is unsafe, so it can be provided through this option instead. | String |
10.4. Endpoint Options
The Azure Storage Blob Service endpoint is configured using URI syntax:
azure-storage-blob:accountName/containerName
with the following path and query parameters:
10.4.1. Path Parameters (2 parameters)
Name | Description | Default | Type |
---|---|---|---|
accountName (common) | Azure account name to be used for authentication with azure blob services. | String | |
containerName (common) | The blob container name. | String |
10.4.2. Query Parameters (48 parameters)
Name | Description | Default | Type |
---|---|---|---|
blobName (common) | The blob name, to consume a specific blob from a container. On the producer, it is only required for blob-level operations. | String | |
blobOffset (common) | Set the blob offset for the upload or download operations, default is 0. | 0 | long |
blobServiceClient (common) | Client to a storage account. This client does not hold any state about a particular storage account but is instead a convenient way of sending off appropriate requests to the resource on the service. It may also be used to construct URLs to blobs and containers. This client contains operations on a service account. Operations on a container are available on BlobContainerClient through getBlobContainerClient(String), and operations on a blob are available on BlobClient through getBlobContainerClient(String).getBlobClient(String). | BlobServiceClient | |
blobType (common) | The blob type in order to initiate the appropriate settings for each blob type. Enum values: blockblob, appendblob, pageblob. | blockblob | BlobType |
closeStreamAfterRead (common) | Close the stream after read or keep it open, default is true. | true | boolean |
credentials (common) | StorageSharedKeyCredential can be injected to create the azure client, this holds the important authentication information. | StorageSharedKeyCredential | |
dataCount (common) | How many bytes to include in the range. Must be greater than or equal to 0 if specified. | Long | |
fileDir (common) | The file directory where the downloaded blobs will be saved to; this can be used by both the producer and the consumer. | String | |
maxResultsPerPage (common) | Specifies the maximum number of blobs to return, including all BlobPrefix elements. If the request does not specify maxResultsPerPage or specifies a value greater than 5,000, the server will return up to 5,000 items. | Integer | |
maxRetryRequests (common) | Specifies the maximum number of additional HTTP Get requests that will be made while reading the data from a response body. | 0 | int |
prefix (common) | Filters the results to return only blobs whose names begin with the specified prefix. May be null to return all blobs. | String | |
regex (common) | Filters the results to return only blobs whose names match the specified regular expression. May be null to return all. If both prefix and regex are set, regex takes priority and prefix is ignored. | String | |
serviceClient (common) | Autowired Client to a storage account. This client does not hold any state about a particular storage account but is instead a convenient way of sending off appropriate requests to the resource on the service. It may also be used to construct URLs to blobs and containers. This client contains operations on a service account. Operations on a container are available on BlobContainerClient through BlobServiceClient#getBlobContainerClient(String), and operations on a blob are available on BlobClient through BlobContainerClient#getBlobClient(String). | BlobServiceClient | |
timeout (common) | An optional timeout value beyond which a RuntimeException will be raised. | Duration | |
bridgeErrorHandler (consumer) | Allows for bridging the consumer to the Camel routing Error Handler, which mean any exceptions occurred while the consumer is trying to pickup incoming messages, or the likes, will now be processed as a message and handled by the routing Error Handler. By default the consumer will use the org.apache.camel.spi.ExceptionHandler to deal with exceptions, that will be logged at WARN or ERROR level and ignored. | false | boolean |
sendEmptyMessageWhenIdle (consumer) | If the polling consumer did not poll any files, you can enable this option to send an empty message (no body) instead. | false | boolean |
exceptionHandler (consumer (advanced)) | To let the consumer use a custom ExceptionHandler. Notice if the option bridgeErrorHandler is enabled then this option is not in use. By default the consumer will deal with exceptions, that will be logged at WARN or ERROR level and ignored. | ExceptionHandler | |
exchangePattern (consumer (advanced)) | Sets the exchange pattern when the consumer creates an exchange. Enum values: InOnly, InOut, InOptionalOut. | ExchangePattern | |
pollStrategy (consumer (advanced)) | A pluggable org.apache.camel.PollingConsumerPollingStrategy allowing you to provide your custom implementation to control error handling usually occurred during the poll operation before an Exchange have been created and being routed in Camel. | PollingConsumerPollStrategy | |
blobSequenceNumber (producer) | A user-controlled value that you can use to track requests. The value of the sequence number must be between 0 and 2^63 - 1. The default value is 0. | 0 | Long |
blockListType (producer) | Specifies which type of blocks to return. Enum values: committed, uncommitted, all. | COMMITTED | BlockListType |
changeFeedContext (producer) | When using getChangeFeed producer operation, this gives additional context that is passed through the Http pipeline during the service call. | Context | |
changeFeedEndTime (producer) | When using getChangeFeed producer operation, this filters the results to return events approximately before the end time. Note: A few events belonging to the next hour can also be returned. A few events belonging to this hour can be missing; to ensure all events from the hour are returned, round the end time up by an hour. | OffsetDateTime | |
changeFeedStartTime (producer) | When using getChangeFeed producer operation, this filters the results to return events approximately after the start time. Note: A few events belonging to the previous hour can also be returned. A few events belonging to this hour can be missing; to ensure all events from the hour are returned, round the start time down by an hour. | OffsetDateTime | |
closeStreamAfterWrite (producer) | Close the stream after write or keep it open, default is true. | true | boolean |
commitBlockListLater (producer) | When set to true, the staged blocks will not be committed directly. | true | boolean |
createAppendBlob (producer) | When set to true, the append blob will be created when committing append blocks. | true | boolean |
createPageBlob (producer) | When set to true, the page blob will be created when uploading a page blob. | true | boolean |
downloadLinkExpiration (producer) | Override the default expiration (millis) of URL download link. | Long | |
lazyStartProducer (producer) | Whether the producer should be started lazy (on the first message). By starting lazy you can use this to allow CamelContext and routes to startup in situations where a producer may otherwise fail during starting and cause the route to fail being started. By deferring this startup to be lazy then the startup failure can be handled during routing messages via Camel’s routing error handlers. Beware that when the first message is processed then creating and starting the producer may take a little time and prolong the total processing time of the processing. | false | boolean |
operation (producer) | The blob operation that can be used with this component on the producer. The supported values are listed in the producer operations section below. | listBlobContainers | BlobOperationsDefinition |
pageBlobSize (producer) | Specifies the maximum size for the page blob, up to 8 TB. The page blob size must be aligned to a 512-byte boundary. | 512 | Long |
backoffErrorThreshold (scheduler) | The number of subsequent error polls (failed due to some error) that should happen before the backoffMultiplier should kick in. | int | |
backoffIdleThreshold (scheduler) | The number of subsequent idle polls that should happen before the backoffMultiplier should kick in. | int | |
backoffMultiplier (scheduler) | To let the scheduled polling consumer backoff if there has been a number of subsequent idles/errors in a row. The multiplier is then the number of polls that will be skipped before the next actual attempt is happening again. When this option is in use then backoffIdleThreshold and/or backoffErrorThreshold must also be configured. | int | |
delay (scheduler) | Milliseconds before the next poll. | 500 | long |
greedy (scheduler) | If greedy is enabled, then the ScheduledPollConsumer will run immediately again, if the previous run polled 1 or more messages. | false | boolean |
initialDelay (scheduler) | Milliseconds before the first poll starts. | 1000 | long |
repeatCount (scheduler) | Specifies a maximum limit of number of fires. So if you set it to 1, the scheduler will only fire once. If you set it to 5, it will only fire five times. A value of zero or negative means fire forever. | 0 | long |
runLoggingLevel (scheduler) | The consumer logs a start/complete log line when it polls. This option allows you to configure the logging level for that. Enum values: TRACE, DEBUG, INFO, WARN, ERROR, OFF. | TRACE | LoggingLevel |
scheduledExecutorService (scheduler) | Allows for configuring a custom/shared thread pool to use for the consumer. By default each consumer has its own single threaded thread pool. | ScheduledExecutorService | |
scheduler (scheduler) | To use a cron scheduler from either camel-spring or camel-quartz component. Use value spring or quartz for built in scheduler. | none | Object |
schedulerProperties (scheduler) | To configure additional properties when using a custom scheduler or any of the Quartz, Spring based scheduler. | Map | |
startScheduler (scheduler) | Whether the scheduler should be auto started. | true | boolean |
timeUnit (scheduler) | Time unit for initialDelay and delay options. Enum values: NANOSECONDS, MICROSECONDS, MILLISECONDS, SECONDS, MINUTES, HOURS, DAYS. | MILLISECONDS | TimeUnit |
useFixedDelay (scheduler) | Controls if fixed delay or fixed rate is used. See ScheduledExecutorService in JDK for details. | true | boolean |
accessKey (security) | Access key for the associated azure account name to be used for authentication with azure blob services. | String | |
sourceBlobAccessKey (security) | Source blob access key: for the copyBlob operation, an access key for the source blob is required. Passing an access key as a header is unsafe, so it can be provided through this option instead. | String |
Required information options
To use this component, you have three options for providing the required Azure authentication information:
- Provide accountName and accessKey for your Azure account; this is the simplest way to get started. The accessKey can be generated through your Azure portal.
- Provide a StorageSharedKeyCredential instance via the credentials option.
- Provide a BlobServiceClient instance via the blobServiceClient option. Note: you do not need to create a specific client such as BlockBlobClient; the BlobServiceClient represents the upper level and can be used to retrieve lower-level clients.
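As a minimal sketch of the second option (the creds bean name is arbitrary), you could bind a StorageSharedKeyCredential in the registry and reference it through the credentials option; the third option is shown in the advanced configuration section below:
StorageSharedKeyCredential credential =
        new StorageSharedKeyCredential("yourAccountName", "yourAccessKey");
// make the credential available to the endpoint via the Camel registry
context.getRegistry().bind("creds", credential);

from("azure-storage-blob://yourAccountName/container1?blobName=hello.txt&credentials=#creds")
    .to("file://blobdirectory");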
10.5. Usage
For example, in order to download the content of the block blob hello.txt, located in container1 in the camelazure storage account, use the following snippet:
from("azure-storage-blob://camelazure/container1?blobName=hello.txt&accessKey=yourAccessKey")
    .to("file://blobdirectory");
10.5.1. Message headers evaluated by the component producer
Header | Variable Name | Type | Operations | Description |
---|---|---|---|---|
|
|
| All | An optional timeout value beyond which a RuntimeException will be raised. |
|
|
| Operations related to container and blob | Metadata to associate with the container or blob. |
|
|
|
|
Specifies how the data in this container is available to the public. Pass |
|
|
| Operations related to container and blob | This contains values which will restrict the successful operation of a variety of requests to the conditions present. These conditions are entirely optional. |
|
|
|
| The details for listing specific blobs |
|
|
|
| Filters the results to return only blobs whose names begin with the specified prefix. May be null to return all blobs. |
|
|
|
| Specifies the maximum number of blobs to return, including all BlobPrefix elements. If the request does not specify maxResultsPerPage or specifies a value greater than 5,000, the server will return up to 5,000 items. |
|
|
|
| Defines options available to configure the behavior of a call to listBlobsFlatSegment on a BlobContainerClient object. |
|
|
|
| Additional parameters for a set of operations. |
|
|
|
| Defines values for AccessTier. |
|
|
| Most operations related to upload blob | An MD5 hash of the block content. This hash is used to verify the integrity of the block during transport. When this header is specified, the storage service compares the hash of the content that has arrived with this header value. Note that this MD5 hash is not stored with the blob. If the two hashes do not match, the operation will fail. |
|
|
| Operations related to page blob | A PageRange object. Given that pages must be aligned with 512-byte boundaries, the start offset must be a modulus of 512 and the end offset must be a modulus of 512 - 1. Examples of valid byte ranges are 0-511, 512-1023, etc. |
|
|
|
|
When is set to |
|
|
|
|
When is set to |
|
|
|
|
When is set to |
|
|
|
| Specifies which type of blocks to return. |
|
|
|
| Specifies the maximum size for the page blob, up to 8 TB. The page blob size must be aligned to a 512-byte boundary. |
|
|
|
| A user-controlled value that you can use to track requests. The value of the sequence number must be between 0 and 2^63 - 1. The default value is 0. |
|
|
|
| Specifies the behavior for deleting the snapshots on this blob. Include will delete the base blob and all snapshots. Only will delete only the snapshots. If a snapshot is being deleted, you must pass null. |
|
|
|
| A ListBlobContainersOptions which specifies what data should be returned by the service. |
|
|
|
| ParallelTransferOptions to use to download to file. Number of parallel transfers parameter is ignored. |
|
|
|
| The file directory where the downloaded blobs will be saved to. |
|
|
|
| Override the default expiration (millis) of URL download link. |
|
|
| Operations related to blob | Override/set the blob name on the exchange headers. |
|
|
| Operations related to container and blob | Override/set the container name on the exchange headers. |
|
|
| All | Specify the producer operation to execute, please see the doc on this page related to producer operation. |
|
|
|
| Filters the results to return only blobs whose names match the specified regular expression. May be null to return all. If both prefix and regex are set, regex takes the priority and prefix is ignored. |
|
|
|
| It filters the results to return events approximately after the start time. Note: A few events belonging to the previous hour can also be returned. A few events belonging to this hour can be missing; to ensure all events from the hour are returned, round the start time down by an hour. |
|
|
|
| It filters the results to return events approximately before the end time. Note: A few events belonging to the next hour can also be returned. A few events belonging to this hour can be missing; to ensure all events from the hour are returned, round the end time up by an hour. |
|
|
|
| This gives additional context that is passed through the Http pipeline during the service call. |
|
|
|
| The source blob account name to be used as source account name in a copy blob operation |
|
|
|
| The source blob container name to be used as source container name in a copy blob operation |
10.5.2. Message headers set by either component producer or consumer
Header | Variable Name | Type | Description |
---|---|---|---|
|
|
| Access tier of the blob. |
|
|
| Datetime when the access tier of the blob last changed. |
|
|
| Archive status of the blob. |
|
|
| Creation time of the blob. |
|
|
| The current sequence number for a page blob. |
|
|
| The size of the blob. |
|
|
| The type of the blob. |
|
|
| Cache control specified for the blob. |
|
|
| Number of blocks committed to an append blob |
|
|
| Content disposition specified for the blob. |
|
|
| Content encoding specified for the blob. |
|
|
| Content language specified for the blob. |
|
|
| Content MD5 specified for the blob. |
|
|
| Content type specified for the blob. |
|
|
| Datetime when the last copy operation on the blob completed. |
|
|
| Snapshot identifier of the last incremental copy snapshot for the blob. |
|
|
| Identifier of the last copy operation performed on the blob. |
|
|
| Progress of the last copy operation performed on the blob. |
|
|
| Source of the last copy operation performed on the blob. |
|
|
| Status of the last copy operation performed on the blob. |
|
|
| Description of the last copy operation on the blob. |
|
|
| The ETag of the blob. |
|
|
| Flag indicating if the access tier of the blob was inferred from properties of the blob. |
|
|
| Flag indicating if the blob was incrementally copied. |
|
|
| Flag indicating if the blob’s content is encrypted on the server. |
|
|
| Datetime when the blob was last modified. |
|
|
| Type of lease on the blob. |
|
|
| State of the lease on the blob. |
|
|
| Status of the lease on the blob. |
|
|
| Additional metadata associated with the blob. |
|
|
| The offset at which the block was committed to the block blob. |
|
|
|
The downloaded filename from the operation |
|
|
|
The download link generated by |
|
|
| Returns non-parsed httpHeaders that can be used by the user. |
10.5.3. Advanced Azure Storage Blob configuration
If your Camel application is running behind a firewall or if you need to have more control over the BlobServiceClient instance configuration, you can create your own instance:
StorageSharedKeyCredential credential = new StorageSharedKeyCredential("yourAccountName", "yourAccessKey");
String uri = String.format("https://%s.blob.core.windows.net", "yourAccountName");

BlobServiceClient client = new BlobServiceClientBuilder()
        .endpoint(uri)
        .credential(credential)
        .buildClient();

// This is camel context
context.getRegistry().bind("client", client);
Then refer to this instance in your Camel azure-storage-blob component configuration:
from("azure-storage-blob://cameldev/container1?blobName=myblob&serviceClient=#client")
    .to("mock:result");
10.5.4. Automatic detection of BlobServiceClient client in registry
The component is capable of detecting the presence of a BlobServiceClient bean in the registry. If it is the only instance of that type, it will be used as the client and you won't have to define it as a URI parameter, as in the example above. This can be very useful for smarter configuration of the endpoint.
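For example (a sketch reusing the client bean built in the previous section), once the single BlobServiceClient is bound in the registry, the endpoint can omit the serviceClient parameter:
// the only BlobServiceClient in the registry is detected and used automatically
context.getRegistry().bind("client", client);

from("azure-storage-blob://cameldev/container1?blobName=myblob")
    .to("mock:result");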
10.5.5. Azure Storage Blob Producer operations
The Camel Azure Storage Blob component provides a wide range of operations on the producer side:
Operations on the service level
For these operations, accountName is required.
Operation | Description |
---|---|
| Get the content of the blob. You can restrict the output of this operation to a blob range. |
| Returns transaction logs of all the changes that occur to the blobs and the blob metadata in your storage account. The change feed provides an ordered, guaranteed, durable, immutable, read-only log of these changes. |
Operations on the container level
For these operations, accountName and containerName are required.
Operation | Description |
---|---|
| Creates a new container within a storage account. If a container with the same name already exists, the producer will ignore it. |
| Deletes the specified container in the storage account. If the container doesn’t exist the operation fails. |
| Returns a list of blobs in this container, with folder structures flattened. |
Operations on the blob level
For these operations, accountName, containerName and blobName are required.
Operation | Blob Type | Description |
---|---|---|
| Common | Get the content of the blob. You can restrict the output of this operation to a blob range. |
| Common | Delete a blob. |
| Common | Downloads the entire blob into a file specified by the path. The file will be created and must not exist; if the file already exists, a FileAlreadyExistsException will be thrown. |
| Common | Generates the download link for the specified blob using shared access signatures (SAS). By default, the link allows access for only one hour. However, you can override the default expiration duration through the headers. |
| BlockBlob | Creates a new block blob, or updates the content of an existing block blob. Updating an existing block blob overwrites any existing metadata on the blob. Partial updates are not supported with PutBlob; the content of the existing blob is overwritten with the new content. |
|
|
Uploads the specified block to the block blob’s "staging area" to be later committed by a call to commitBlobBlockList. However in case header |
|
|
Writes a blob by specifying the list of block IDs that are to make up the blob. In order to be written as part of a blob, a block must have been successfully written to the server in a prior |
|
| Returns the list of blocks that have been uploaded as part of a block blob using the specified block list filter. |
|
| Creates a 0-length append blob. Call the commitAppendBlob operation to append data to an append blob. |
|
|
Commits a new block of data to the end of the existing append blob. In case of header |
|
|
Creates a page blob of the specified length. Call |
|
|
Writes one or more pages to the page blob. The write size must be a multiple of 512. In case of header |
|
| Resizes the page blob to the specified size (which must be a multiple of 512). |
|
| Frees the specified pages from the page blob. The size of the range must be a multiple of 512. |
|
| Returns the list of valid page ranges for a page blob or snapshot of a page blob. |
|
| Copy a blob from one container to another one, even from different accounts. |
Refer to the examples section on this page to learn how to use these operations in your Camel application.
10.5.6. Consumer Examples
To consume a blob into a file using the file component, you can do the following:
from("azure-storage-blob://camelazure/container1?blobName=hello.txt&accountName=yourAccountName&accessKey=yourAccessKey")
    .to("file://blobdirectory");
However, you can also write to a file directly without using the file component; in that case, you will need to specify the fileDir folder path in order to save the blob on your machine.
from("azure-storage-blob://camelazure/container1?blobName=hello.txt&accountName=yourAccountName&accessKey=yourAccessKey&fileDir=/var/to/awesome/dir")
    .to("mock:results");
The component also supports the batch consumer: you can consume multiple blobs by specifying only the container name, and the consumer will return multiple exchanges depending on the number of blobs in the container.
Example
from("azure-storage-blob://camelazure/container1?accountName=yourAccountName&accessKey=yourAccessKey&fileDir=/var/to/awesome/dir")
    .to("mock:results");
10.5.7. Producer Operations Examples
- listBlobContainers

from("direct:start")
    .process(exchange -> {
        // set the header you want the producer to evaluate, refer to the previous
        // section to learn about the headers that can be set
        // e.g: exchange.getIn().setHeader(BlobConstants.LIST_BLOB_CONTAINERS_OPTIONS, new ListBlobContainersOptions().setMaxResultsPerPage(10));
    })
    .to("azure-storage-blob://camelazure?operation=listBlobContainers&serviceClient=#client")
    .to("mock:result");
- createBlobContainer

from("direct:start")
    .process(exchange -> {
        // set the header you want the producer to evaluate, refer to the previous
        // section to learn about the headers that can be set
        // e.g: exchange.getIn().setHeader(BlobConstants.BLOB_CONTAINER_NAME, "newContainerName");
    })
    .to("azure-storage-blob://camelazure/container1?operation=createBlobContainer&serviceClient=#client")
    .to("mock:result");
- deleteBlobContainer:

from("direct:start")
    .process(exchange -> {
        // set the header you want the producer to evaluate, refer to the previous
        // section to learn about the headers that can be set
        // e.g: exchange.getIn().setHeader(BlobConstants.BLOB_CONTAINER_NAME, "overridenName");
    })
    .to("azure-storage-blob://camelazure/container1?operation=deleteBlobContainer&serviceClient=#client")
    .to("mock:result");
- listBlobs:

from("direct:start")
    .process(exchange -> {
        // set the header you want the producer to evaluate, refer to the previous
        // section to learn about the headers that can be set
        // e.g: exchange.getIn().setHeader(BlobConstants.BLOB_CONTAINER_NAME, "overridenName");
    })
    .to("azure-storage-blob://camelazure/container1?operation=listBlobs&serviceClient=#client")
    .to("mock:result");
- getBlob:

We can either set an outputStream in the exchange body and the data will be written to it. For example:

from("direct:start")
    .process(exchange -> {
        // set the header you want the producer to evaluate, refer to the previous
        // section to learn about the headers that can be set
        // e.g: exchange.getIn().setHeader(BlobConstants.BLOB_CONTAINER_NAME, "overridenName");
        // set our body
        exchange.getIn().setBody(outputStream);
    })
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=getBlob&serviceClient=#client")
    .to("mock:result");
Or, if we don't set a body, this operation will give us an InputStream instance, which can be processed further downstream:

from("direct:start")
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=getBlob&serviceClient=#client")
    .process(exchange -> {
        InputStream inputStream = exchange.getMessage().getBody(InputStream.class);
        // We use Apache common IO for simplicity, but you are free to do whatever dealing
        // with inputStream
        System.out.println(IOUtils.toString(inputStream, StandardCharsets.UTF_8.name()));
    })
    .to("mock:result");
- deleteBlob:

from("direct:start")
    .process(exchange -> {
        // set the header you want the producer to evaluate, refer to the previous
        // section to learn about the headers that can be set
        // e.g: exchange.getIn().setHeader(BlobConstants.BLOB_NAME, "overridenName");
    })
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=deleteBlob&serviceClient=#client")
    .to("mock:result");
- downloadBlobToFile:

from("direct:start")
    .process(exchange -> {
        // set the header you want the producer to evaluate, refer to the previous
        // section to learn about the headers that can be set
        // e.g: exchange.getIn().setHeader(BlobConstants.BLOB_NAME, "overridenName");
    })
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=downloadBlobToFile&fileDir=/var/mydir&serviceClient=#client")
    .to("mock:result");
- downloadLink

from("direct:start")
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=downloadLink&serviceClient=#client")
    .process(exchange -> {
        String link = exchange.getMessage().getHeader(BlobConstants.DOWNLOAD_LINK, String.class);
        System.out.println("My link " + link);
    })
    .to("mock:result");
- uploadBlockBlob

from("direct:start")
    .process(exchange -> {
        // set the header you want the producer to evaluate, refer to the previous
        // section to learn about the headers that can be set
        // e.g: exchange.getIn().setHeader(BlobConstants.BLOB_NAME, "overridenName");
        exchange.getIn().setBody("Block Blob");
    })
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=uploadBlockBlob&serviceClient=#client")
    .to("mock:result");
- stageBlockBlobList

from("direct:start")
    .process(exchange -> {
        final List<BlobBlock> blocks = new LinkedList<>();
        blocks.add(BlobBlock.createBlobBlock(new ByteArrayInputStream("Hello".getBytes())));
        blocks.add(BlobBlock.createBlobBlock(new ByteArrayInputStream("From".getBytes())));
        blocks.add(BlobBlock.createBlobBlock(new ByteArrayInputStream("Camel".getBytes())));
        exchange.getIn().setBody(blocks);
    })
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=stageBlockBlobList&serviceClient=#client")
    .to("mock:result");
- commitBlockBlobList

from("direct:start")
    .process(exchange -> {
        // We assume here you have the knowledge of these blocks you want to commit
        final List<Block> blocksIds = new LinkedList<>();
        blocksIds.add(new Block().setName("id-1"));
        blocksIds.add(new Block().setName("id-2"));
        blocksIds.add(new Block().setName("id-3"));
        exchange.getIn().setBody(blocksIds);
    })
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=commitBlockBlobList&serviceClient=#client")
    .to("mock:result");
- getBlobBlockList

from("direct:start")
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=getBlobBlockList&serviceClient=#client")
    .log("${body}")
    .to("mock:result");
- createAppendBlob

from("direct:start")
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=createAppendBlob&serviceClient=#client")
    .to("mock:result");
- commitAppendBlob

from("direct:start")
    .process(exchange -> {
        final String data = "Hello world from my awesome tests!";
        final InputStream dataStream = new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8));
        exchange.getIn().setBody(dataStream);
        // of course you can set whatever headers you like, refer to the headers section to learn more
    })
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=commitAppendBlob&serviceClient=#client")
    .to("mock:result");
- createPageBlob

from("direct:start")
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=createPageBlob&serviceClient=#client")
    .to("mock:result");
- uploadPageBlob

from("direct:start")
    .process(exchange -> {
        byte[] dataBytes = new byte[512]; // we set range for the page from 0-511
        new Random().nextBytes(dataBytes);
        final InputStream dataStream = new ByteArrayInputStream(dataBytes);
        final PageRange pageRange = new PageRange().setStart(0).setEnd(511);
        exchange.getIn().setHeader(BlobConstants.PAGE_BLOB_RANGE, pageRange);
        exchange.getIn().setBody(dataStream);
    })
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=uploadPageBlob&serviceClient=#client")
    .to("mock:result");
- resizePageBlob

from("direct:start")
    .process(exchange -> {
        final PageRange pageRange = new PageRange().setStart(0).setEnd(511);
        exchange.getIn().setHeader(BlobConstants.PAGE_BLOB_RANGE, pageRange);
    })
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=resizePageBlob&serviceClient=#client")
    .to("mock:result");
- clearPageBlob

from("direct:start")
    .process(exchange -> {
        final PageRange pageRange = new PageRange().setStart(0).setEnd(511);
        exchange.getIn().setHeader(BlobConstants.PAGE_BLOB_RANGE, pageRange);
    })
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=clearPageBlob&serviceClient=#client")
    .to("mock:result");
- getPageBlobRanges

from("direct:start")
    .process(exchange -> {
        final PageRange pageRange = new PageRange().setStart(0).setEnd(511);
        exchange.getIn().setHeader(BlobConstants.PAGE_BLOB_RANGE, pageRange);
    })
    .to("azure-storage-blob://camelazure/container1?blobName=blob&operation=getPageBlobRanges&serviceClient=#client")
    .log("${body}")
    .to("mock:result");
- copyBlob

from("direct:copyBlob")
    .process(exchange -> {
        exchange.getIn().setHeader(BlobConstants.BLOB_NAME, "file.txt");
        exchange.getMessage().setHeader(BlobConstants.SOURCE_BLOB_CONTAINER_NAME, "containerblob1");
        exchange.getMessage().setHeader(BlobConstants.SOURCE_BLOB_ACCOUNT_NAME, "account");
    })
    .to("azure-storage-blob://account/containerblob2?operation=copyBlob&sourceBlobAccessKey=RAW(accessKey)")
    .to("mock:result");
In this way, file.txt in the container containerblob1 of the account 'account' will be copied to the container containerblob2 of the same account.
10.5.8. Development Notes (Important)
All integration tests use Testcontainers and run by default. You need to obtain an Azure accessKey and accountName to be able to run all integration tests against the Azure services. In addition to the mocked unit tests, you should run the integration tests with every change you make, or even a client upgrade, as the Azure client can break things even on minor version upgrades. To run the integration tests, run the following Maven command in this component's directory:
mvn verify -PfullTests -DaccountName=myacc -DaccessKey=mykey
where accountName is your Azure account name and accessKey is the access key generated from the Azure portal.
10.6. Spring Boot Auto-Configuration
When using azure-storage-blob with Spring Boot, make sure to use the following Maven dependency to have support for auto-configuration:
<dependency>
    <groupId>org.apache.camel.springboot</groupId>
    <artifactId>camel-azure-storage-blob-starter</artifactId>
</dependency>
The component supports 32 options, which are listed below.
Name | Description | Default | Type |
---|---|---|---|
camel.component.azure-storage-blob.access-key | Access key for the associated azure account name to be used for authentication with azure blob services. | String | |
camel.component.azure-storage-blob.autowired-enabled | Whether autowiring is enabled. This is used for automatic autowiring options (the option must be marked as autowired) by looking up in the registry to find if there is a single instance of matching type, which then gets configured on the component. This can be used for automatic configuring JDBC data sources, JMS connection factories, AWS Clients, etc. | true | Boolean |
camel.component.azure-storage-blob.blob-name | The blob name, to consume a specific blob from a container. On the producer, it is only required for blob-level operations. | String | |
camel.component.azure-storage-blob.blob-offset | Set the blob offset for the upload or download operations, default is 0. | 0 | Long |
camel.component.azure-storage-blob.blob-sequence-number | A user-controlled value that you can use to track requests. The value of the sequence number must be between 0 and 2^63 - 1. The default value is 0. | 0 | Long |
camel.component.azure-storage-blob.blob-type | The blob type in order to initiate the appropriate settings for each blob type. | BlobType | |
camel.component.azure-storage-blob.block-list-type | Specifies which type of blocks to return. | BlockListType | |
camel.component.azure-storage-blob.bridge-error-handler | Allows for bridging the consumer to the Camel routing Error Handler, which mean any exceptions occurred while the consumer is trying to pickup incoming messages, or the likes, will now be processed as a message and handled by the routing Error Handler. By default the consumer will use the org.apache.camel.spi.ExceptionHandler to deal with exceptions, that will be logged at WARN or ERROR level and ignored. | false | Boolean |
camel.component.azure-storage-blob.change-feed-context | When using getChangeFeed producer operation, this gives additional context that is passed through the Http pipeline during the service call. The option is a com.azure.core.util.Context type. | Context | |
camel.component.azure-storage-blob.change-feed-end-time | When using getChangeFeed producer operation, this filters the results to return events approximately before the end time. Note: A few events belonging to the next hour can also be returned. A few events belonging to this hour can be missing; to ensure all events from the hour are returned, round the end time up by an hour. The option is a java.time.OffsetDateTime type. | OffsetDateTime | |
camel.component.azure-storage-blob.change-feed-start-time | When using getChangeFeed producer operation, this filters the results to return events approximately after the start time. Note: A few events belonging to the previous hour can also be returned. A few events belonging to this hour can be missing; to ensure all events from the hour are returned, round the start time down by an hour. The option is a java.time.OffsetDateTime type. | OffsetDateTime | |
camel.component.azure-storage-blob.close-stream-after-read | Close the stream after read or keep it open, default is true. | true | Boolean |
camel.component.azure-storage-blob.close-stream-after-write | Close the stream after write or keep it open, default is true. | true | Boolean |
camel.component.azure-storage-blob.commit-block-list-later | When set to true, the staged blocks will not be committed directly. | true | Boolean |
camel.component.azure-storage-blob.configuration | The component configurations. The option is a org.apache.camel.component.azure.storage.blob.BlobConfiguration type. | BlobConfiguration | |
camel.component.azure-storage-blob.create-append-blob | When set to true, the append blob will be created when committing append blocks. | true | Boolean |
camel.component.azure-storage-blob.create-page-blob | When set to true, the page blob will be created when uploading a page blob. | true | Boolean |
camel.component.azure-storage-blob.credentials | StorageSharedKeyCredential can be injected to create the azure client, this holds the important authentication information. The option is a com.azure.storage.common.StorageSharedKeyCredential type. | StorageSharedKeyCredential | |
camel.component.azure-storage-blob.data-count | How many bytes to include in the range. Must be greater than or equal to 0 if specified. | Long | |
camel.component.azure-storage-blob.download-link-expiration | Override the default expiration (millis) of URL download link. | Long | |
camel.component.azure-storage-blob.enabled | Whether to enable auto configuration of the azure-storage-blob component. This is enabled by default. | Boolean | |
camel.component.azure-storage-blob.file-dir | The file directory where the downloaded blobs will be saved to; this can be used by both the producer and the consumer. | String | |
camel.component.azure-storage-blob.lazy-start-producer | Whether the producer should be started lazy (on the first message). By starting lazy you can use this to allow CamelContext and routes to startup in situations where a producer may otherwise fail during starting and cause the route to fail being started. By deferring this startup to be lazy then the startup failure can be handled during routing messages via Camel’s routing error handlers. Beware that when the first message is processed then creating and starting the producer may take a little time and prolong the total processing time of the processing. | false | Boolean |
camel.component.azure-storage-blob.max-results-per-page | Specifies the maximum number of blobs to return, including all BlobPrefix elements. If the request does not specify maxResultsPerPage or specifies a value greater than 5,000, the server will return up to 5,000 items. | Integer | |
camel.component.azure-storage-blob.max-retry-requests | Specifies the maximum number of additional HTTP Get requests that will be made while reading the data from a response body. | 0 | Integer |
camel.component.azure-storage-blob.operation | The blob operation that can be used with this component on the producer. | BlobOperationsDefinition | |
camel.component.azure-storage-blob.page-blob-size | Specifies the maximum size for the page blob, up to 8 TB. The page blob size must be aligned to a 512-byte boundary. | 512 | Long |
camel.component.azure-storage-blob.prefix | Filters the results to return only blobs whose names begin with the specified prefix. May be null to return all blobs. | String | |
camel.component.azure-storage-blob.regex | Filters the results to return only blobs whose names match the specified regular expression. May be null to return all. If both prefix and regex are set, regex takes priority and prefix is ignored. | String | |
camel.component.azure-storage-blob.service-client | Client to a storage account. This client does not hold any state about a particular storage account but is instead a convenient way of sending off appropriate requests to the resource on the service. It may also be used to construct URLs to blobs and containers. This client contains operations on a service account. Operations on a container are available on BlobContainerClient through BlobServiceClient#getBlobContainerClient(String), and operations on a blob are available on BlobClient through BlobContainerClient#getBlobClient(String). The option is a com.azure.storage.blob.BlobServiceClient type. | BlobServiceClient | |
camel.component.azure-storage-blob.source-blob-access-key | Source blob access key: for the copyBlob operation, an access key for the source blob is required. Passing an access key as a header is unsafe, so it can be provided through this option instead. | String | |
camel.component.azure-storage-blob.timeout | An optional timeout value beyond which a RuntimeException will be raised. The option is a java.time.Duration type. | Duration |
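For example, a minimal application.properties sketch using a few of the options above (the access key value is a placeholder):
camel.component.azure-storage-blob.access-key = yourAccessKey
camel.component.azure-storage-blob.blob-type = blockblob
camel.component.azure-storage-blob.max-retry-requests = 3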