Azure Blob Storage
Schema-driven source documentation.
AZURE_BLOB_STORAGE · 42 fields · 1 example
Commonly Asked Questions
Assistant knowledge mapped to this source type from assistant_knowledge.json.
Required
Fields required for a valid configuration payload under `config.required`.
| Path | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| required | object | Yes | — | — | no extra properties |
| required.account_url | string | Yes | Azure Blob account URL (for example, https://<account>.blob.core.windows.net) | — | format uri |
| required.container | string | Yes | Azure Blob container name | — | — |
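The constraints in the table above can be checked before a payload is submitted. The following is a minimal sketch, not the product's validator: the function name `validate_required` and the error wording are hypothetical, and "format uri" is approximated here as "has a scheme and a host".

```python
from urllib.parse import urlparse

# Property names taken from the Required table; everything else is illustrative.
ALLOWED_REQUIRED_KEYS = {"account_url", "container"}


def validate_required(required: dict) -> list:
    """Return a list of problems; an empty list means the section looks valid."""
    problems = []

    # "no extra properties": reject keys the schema does not list.
    extra = set(required) - ALLOWED_REQUIRED_KEYS
    if extra:
        problems.append(f"unexpected properties: {sorted(extra)}")

    # Both fields are required strings.
    for key in sorted(ALLOWED_REQUIRED_KEYS):
        if not isinstance(required.get(key), str):
            problems.append(f"{key} must be a string")

    # Assumption: treat "format uri" as an absolute URI with scheme and host.
    url = required.get("account_url")
    if isinstance(url, str):
        parsed = urlparse(url)
        if not (parsed.scheme and parsed.netloc):
            problems.append("account_url must be an absolute URI")

    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.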
Masked
Sensitive fields under `config.masked` (secrets/credentials).
| Path | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| masked | object | No | Optional Azure credentials. Leave empty to use managed identity/default credential chain. | — | no extra properties |
| masked.azure_account_key | string | No | Azure storage account key | — | — |
| masked.azure_client_id | string | No | Azure Entra client ID (service principal auth) | — | — |
| masked.azure_client_secret | string | No | Azure Entra client secret (service principal auth) | — | — |
| masked.azure_connection_string | string | No | Azure storage connection string (takes precedence over other auth fields) | — | — |
| masked.azure_sas_token | string | No | Azure SAS token | — | — |
| masked.azure_tenant_id | string | No | Azure Entra tenant ID (service principal auth) | — | — |
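The table states two rules: a connection string takes precedence over the other auth fields, and an empty `masked` object falls back to managed identity or the default credential chain. The sketch below illustrates one way those rules could be resolved. The function name `select_auth_mode`, the returned labels, and the relative order of service principal, account key, and SAS token are assumptions; the source does not specify them.

```python
from typing import Optional


def select_auth_mode(masked: Optional[dict]) -> str:
    """Pick an auth mode label from the masked section (illustrative only)."""
    masked = masked or {}

    # Documented: the connection string wins over every other auth field.
    if masked.get("azure_connection_string"):
        return "connection_string"

    # Assumption: service principal auth needs all three Entra fields.
    sp_fields = ("azure_tenant_id", "azure_client_id", "azure_client_secret")
    if all(masked.get(field) for field in sp_fields):
        return "service_principal"

    # Assumption: the order of the next two checks is not given by the schema.
    if masked.get("azure_account_key"):
        return "account_key"
    if masked.get("azure_sas_token"):
        return "sas_token"

    # Documented: empty credentials mean managed identity / default chain.
    return "default_credential_chain"
```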
Optional
Optional configuration fields under `config.optional`.
| Path | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| optional | object | No | — | — | no extra properties |
| optional.connection | object | No | — | — | no extra properties |
| optional.connection.max_keys_per_page | integer | No | Maximum blobs requested per list page | 200 | min 1, max 1000 |
| optional.connection.max_object_bytes | integer | No | Maximum bytes downloaded per blob for MIME detection and text extraction | 5242880 | min 1024, max 52428800 |
| optional.connection.request_timeout_seconds | number | No | Network timeout in seconds for list/download operations | 30 | min 1, max 300 |
| optional.scope | object | No | Object scope and filtering controls. | — | no extra properties |
| optional.scope.exclude_extensions | array | No | Optional extension denylist | — | — |
| optional.scope.exclude_extensions[] | string | No | — | — | — |
| optional.scope.include_content_preview | boolean | No | Download object bytes to infer MIME and extract detector-ready text previews | true | — |
| optional.scope.include_empty_objects | boolean | No | Include zero-byte objects in extraction results | false | — |
| optional.scope.include_extensions | array | No | Optional extension allowlist (for example, .pdf, .csv, .parquet) | — | — |
| optional.scope.include_extensions[] | string | No | — | — | — |
| optional.scope.include_object_metadata | boolean | No | Attach provider metadata (etag, size, content-type hints, timestamps) to asset checksums | true | — |
| optional.scope.prefix | string | No | Object key prefix filter (for example, exports/2026/) | — | — |
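The defaults and min/max constraints for `optional.connection` can be applied in one step. This is a hedged sketch: the defaults and limits are copied from the table above, while the function name `resolve_connection` and the choice to raise `ValueError` are assumptions for illustration.

```python
# name: (default, minimum, maximum), as listed in the Optional table.
CONNECTION_LIMITS = {
    "max_keys_per_page": (200, 1, 1000),
    "max_object_bytes": (5242880, 1024, 52428800),
    "request_timeout_seconds": (30, 1, 300),
}


def resolve_connection(connection=None):
    """Fill in defaults and enforce the documented ranges."""
    connection = connection or {}

    # "no extra properties": reject settings the schema does not list.
    unknown = set(connection) - set(CONNECTION_LIMITS)
    if unknown:
        raise ValueError(f"unexpected properties: {sorted(unknown)}")

    resolved = {}
    for name, (default, low, high) in CONNECTION_LIMITS.items():
        value = connection.get(name, default)
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
        resolved[name] = value
    return resolved
```

For instance, `resolve_connection({"request_timeout_seconds": 45})` keeps the 45-second timeout and fills in 200 keys per page and a 5242880-byte (5 MiB) download cap.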
Examples
Reference payloads generated from shared source examples JSON.
Azure Blob container validation scan
Validates Azure Blob extraction using a small random sample and connection string authentication.
Schedule
{
  "enabled": true,
  "preset": "weekly",
  "cron": "17 3 * * 0",
  "timezone": "UTC"
}
Config Payload
{
  "type": "AZURE_BLOB_STORAGE",
  "required": {
    "account_url": "https://acmestorage.blob.core.windows.net",
    "container": "finance-archive"
  },
  "masked": {
    "azure_connection_string": "DefaultEndpointsProtocol=https;AccountName=acmestorage;AccountKey=<key>;EndpointSuffix=core.windows.net"
  },
  "optional": {
    "scope": {
      "prefix": "2026/",
      "exclude_extensions": [
        ".png",
        ".jpg"
      ],
      "include_empty_objects": false
    },
    "connection": {
      "request_timeout_seconds": 45,
      "max_keys_per_page": 250
    }
  },
  "sampling": {
    "strategy": "RANDOM",
    "limit": 25
  }
}
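To make the `scope` settings in this payload concrete, the sketch below shows how a prefix, an extension denylist, and the empty-object flag might decide whether a blob is scanned. It is an illustration only: the function name `in_scope` is hypothetical, and the case-insensitive extension matching and the rule that the denylist is applied after the allowlist are assumptions the schema does not confirm.

```python
import os


def in_scope(name: str, size: int, scope: dict) -> bool:
    """Return True if a blob with this name and size passes the scope filters."""
    # Prefix filter: blob names must start with the configured prefix.
    prefix = scope.get("prefix")
    if prefix and not name.startswith(prefix):
        return False

    # Zero-byte blobs are skipped unless include_empty_objects is true.
    if size == 0 and not scope.get("include_empty_objects", False):
        return False

    # Assumption: extensions are compared case-insensitively.
    ext = os.path.splitext(name)[1].lower()
    include = [e.lower() for e in scope.get("include_extensions", [])]
    exclude = [e.lower() for e in scope.get("exclude_extensions", [])]

    # Assumption: an allowlist, when present, is checked before the denylist.
    if include and ext not in include:
        return False
    return ext not in exclude


scope = {
    "prefix": "2026/",
    "exclude_extensions": [".png", ".jpg"],
    "include_empty_objects": False,
}
print(in_scope("2026/q1/report.pdf", 1024, scope))  # True
print(in_scope("2026/q1/logo.PNG", 1024, scope))    # False
```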