Google Cloud Storage
Schema-driven source documentation.
GOOGLE_CLOUD_STORAGE, 38 fields, 1 example
Commonly Asked Questions
Assistant knowledge mapped to this source type from `assistant_knowledge.json`.
Required
Fields required for a valid configuration payload under `config.required`.
| Path | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| required | object | Yes | — | — | no extra properties |
| required.bucket | string | Yes | Google Cloud Storage bucket name | — | — |
Masked
Sensitive fields under `config.masked` (secrets/credentials).
| Path | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| masked | object | No | Optional inline service account credentials JSON. Leave empty to use ADC/workload identity. | — | no extra properties |
| masked.gcp_credentials_json | string | No | Google service account credentials JSON as inline string | — | — |
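The choice between inline credentials and ADC described above can be sketched as a small helper. This is a minimal illustration, not the product's implementation: the function name `resolve_credential_source` and the returned labels are hypothetical, and the precedence shown (inline JSON first, then `optional.connection.gcp_credentials_file`, then ADC/workload identity) is an assumption that the source does not state.

```python
import json
from typing import Any


def resolve_credential_source(config: dict[str, Any]) -> str:
    """Report which credential source a payload selects (hypothetical helper)."""
    masked = config.get("masked") or {}
    inline = masked.get("gcp_credentials_json")
    if inline:
        # Parse early so malformed inline JSON fails here, not at connect time.
        json.loads(inline)
        return "inline_json"

    connection = (config.get("optional") or {}).get("connection") or {}
    if connection.get("gcp_credentials_file"):
        return "credentials_file"

    # Nothing supplied: fall back to ADC / workload identity.
    return "adc"
```

With `"masked": {}` and no credentials file, as in a payload that relies on ADC, the helper returns `"adc"`.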
Optional
Optional configuration fields under `config.optional`.
| Path | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| optional | object | No | — | — | no extra properties |
| optional.connection | object | No | — | — | no extra properties |
| optional.connection.gcp_credentials_file | string | No | Path to Google service account JSON credentials file | — | — |
| optional.connection.max_keys_per_page | integer | No | Maximum objects requested per list page | 200 | min 1, max 1000 |
| optional.connection.max_object_bytes | integer | No | Maximum bytes downloaded per object for MIME detection and text extraction | 5242880 | min 1024, max 52428800 |
| optional.connection.project_id | string | No | Optional GCP project ID override for auth context and bucket listing | — | — |
| optional.connection.request_timeout_seconds | number | No | Network timeout in seconds for list/download operations | 30 | min 1, max 300 |
| optional.scope | object | No | Object scope and filtering controls. | — | no extra properties |
| optional.scope.exclude_extensions | array | No | Optional extension denylist | — | — |
| optional.scope.exclude_extensions[] | string | No | — | — | — |
| optional.scope.include_content_preview | boolean | No | Download object bytes to infer MIME and extract detector-ready text previews | true | — |
| optional.scope.include_empty_objects | boolean | No | Include zero-byte objects in extraction results | false | — |
| optional.scope.include_extensions | array | No | Optional extension allowlist (for example, .pdf, .csv, .parquet) | — | — |
| optional.scope.include_extensions[] | string | No | — | — | — |
| optional.scope.include_object_metadata | boolean | No | Attach provider metadata (etag, size, content-type hints, timestamps) to asset checksums | true | — |
| optional.scope.prefix | string | No | Object key prefix filter (for example, exports/2026/) | — | — |
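To make the interaction of the `optional.scope` filters concrete, the sketch below decides whether a single object would be in scope. It is illustrative only: the function name `in_scope` is hypothetical, and several behaviors are assumptions not confirmed by the source, namely that extensions are matched case-insensitively as name suffixes, that the denylist is applied after the allowlist, and that an empty or missing list means "no restriction".

```python
from typing import Any


def in_scope(name: str, size: int, scope: dict[str, Any]) -> bool:
    """Return True if an object passes the optional.scope filters (sketch)."""
    prefix = scope.get("prefix")
    if prefix and not name.startswith(prefix):
        return False

    # include_empty_objects defaults to false per the table above.
    if size == 0 and not scope.get("include_empty_objects", False):
        return False

    lowered = name.lower()
    include = scope.get("include_extensions") or []
    if include and not any(lowered.endswith(ext.lower()) for ext in include):
        return False

    exclude = scope.get("exclude_extensions") or []
    if any(lowered.endswith(ext.lower()) for ext in exclude):
        return False

    return True
```

For example, with `prefix` set to `exports/` and `include_extensions` set to `[".csv", ".pdf"]`, a non-empty object named `exports/2026/report.csv` passes, while `raw/report.csv` and a zero-byte `exports/empty.csv` do not.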
Examples
Reference payloads generated from shared source examples JSON.
GCS full bucket sweep
Scan every object under the `exports/` prefix of a GCS bucket using Application Default Credentials (`masked` is left empty).
Schedule
{
  "enabled": true,
  "preset": "nightly",
  "cron": "26 1 * * *",
  "timezone": "UTC"
}
Config Payload
{
  "type": "GOOGLE_CLOUD_STORAGE",
  "required": {
    "bucket": "prod-data-lake"
  },
  "masked": {},
  "optional": {
    "connection": {
      "project_id": "acme-prod"
    },
    "scope": {
      "prefix": "exports/"
    }
  },
  "sampling": {
    "strategy": "ALL"
  }
}
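A payload like the one above can be checked against the documented constraints before it is submitted. The following is a rough sketch under stated assumptions: `check_payload` and `CONNECTION_LIMITS` are hypothetical names, the limits are copied from the Optional table, an empty bucket name is treated as missing (an assumption), and the check does not enforce "no extra properties" or distinguish integers from other numbers.

```python
from typing import Any

# (default, minimum, maximum) copied from the Optional table above.
CONNECTION_LIMITS = {
    "max_keys_per_page": (200, 1, 1000),
    "max_object_bytes": (5242880, 1024, 52428800),
    "request_timeout_seconds": (30, 1, 300),
}


def check_payload(payload: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    errors: list[str] = []

    bucket = (payload.get("required") or {}).get("bucket")
    # Assumption: an empty bucket name is treated as missing.
    if not isinstance(bucket, str) or not bucket:
        errors.append("required.bucket must be a non-empty string")

    connection = (payload.get("optional") or {}).get("connection") or {}
    for field, (default, low, high) in CONNECTION_LIMITS.items():
        value = connection.get(field, default)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"optional.connection.{field} must be a number")
        elif not low <= value <= high:
            errors.append(
                f"optional.connection.{field} must be between {low} and {high}"
            )
    return errors
```

Applied to the example payload, which sets only `project_id` under `optional.connection`, the function returns an empty list because every limited field falls back to its documented default.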