Custom
Schema-driven detector documentation.
CUSTOM · active · P0 · 61 params · 19 examples
Detector Metadata
Capability catalog entry from all_detectors.json.
Categories
CLASSIFICATION, COMPLIANCE
Supported Asset Types
TXT, TABLE, URL, IMAGE
Recommended Model
mDeBERTa-v3 + SetFit + GLiNER + HuggingFace transformers
Notes
User-defined rules and pipelines tailored to specific business needs. Supports regex, GLiNER2, LLM, text classification, image classification, feature extraction, and object detection pipelines.
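As an illustration of the regex and keyword support mentioned above, here is a minimal sketch of a RULESET configuration written as a Python dict, together with a few local sanity checks. The key names follow the parameter table in this document; all values (detector key, rule IDs, pattern, keywords) are made up for the example, and the mapping from flag letters to Python `re` flags is an assumption about how the detector interprets them, not confirmed behaviour.

```python
import re

# Hypothetical example config: keys follow the parameter table, values are invented.
config = {
    "custom_detector_key": "internal-project-codes",
    "name": "Internal project codes",
    "method": "RULESET",
    "languages": ["de", "en"],
    "ruleset": {
        "regex_rules": [
            {
                "id": "proj-code",
                "name": "Project code",
                "pattern": r"\bPRJ-\d{4}\b",
                "flags": "i",
                "severity": "medium",
            }
        ],
        "keyword_rules": [
            {
                "id": "confidential-terms",
                "name": "Confidential terms",
                "keywords": ["confidential", "vertraulich"],
                "case_sensitive": False,
                "severity": "high",
            }
        ],
    },
    "max_findings": None,
}

# Assumed mapping from flag letters (i, m, s) to Python regex flags.
FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def compile_rule(rule):
    """Compile one regex rule, combining its flag letters into re flags."""
    flags = 0
    for letter in rule.get("flags", ""):
        flags |= FLAG_MAP[letter]
    return re.compile(rule["pattern"], flags)


def check_config(cfg):
    """Return a list of problems found by a few local sanity checks."""
    problems = []
    for key in ("custom_detector_key", "name"):
        if not cfg.get(key):
            problems.append(f"missing required field: {key}")
    ruleset = cfg.get("ruleset", {})
    for rule in ruleset.get("regex_rules", []):
        for key in ("id", "name", "pattern"):
            if key not in rule:
                problems.append(f"regex rule missing required field: {key}")
        try:
            compile_rule(rule)
        except (re.error, KeyError) as exc:
            problems.append(f"regex rule does not compile: {exc!r}")
    for rule in ruleset.get("keyword_rules", []):
        if not rule.get("keywords"):
            problems.append("keyword rule needs at least one keyword")
    return problems


problems = check_config(config)
pattern = compile_rule(config["ruleset"]["regex_rules"][0])
```

These checks only mirror the Required and Constraints columns for a handful of fields; the service's own schema validation remains authoritative.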
Parameters
Configuration parameters for the Custom detector. Shared from `CustomDetectorConfig`.
| Parameter | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| custom_detector_key | string | Yes | Stable key used to identify one custom detector instance | — | — |
| name | string | Yes | User-facing name of custom detector | — | — |
| description | string | No | — | — | — |
| method | enum | No | Execution method for custom detector logic. Allowed values: RULESET, CLASSIFIER, ENTITY, PIPELINE | — | — |
| languages | array | No | — | ["de","en"] | — |
| languages[] | string | No | — | — | — |
| ruleset | object | No | — | — | no extra properties |
| ruleset.regex_rules | array | No | — | [] | — |
| ruleset.regex_rules[] | object | No | — | — | no extra properties |
| ruleset.regex_rules[].id | string | Yes | Stable ID for this regex rule | — | — |
| ruleset.regex_rules[].name | string | Yes | Display name for this regex rule | — | — |
| ruleset.regex_rules[].pattern | string | Yes | Regular expression pattern | — | — |
| ruleset.regex_rules[].flags | string | No | Regex flags (for example i, m, s) | — | — |
| ruleset.regex_rules[].severity | enum | No | Severity level of finding. Allowed values: critical, high, medium, low, info | — | — |
| ruleset.keyword_rules | array | No | — | [] | — |
| ruleset.keyword_rules[] | object | No | — | — | no extra properties |
| ruleset.keyword_rules[].id | string | Yes | Stable ID for this keyword rule | — | — |
| ruleset.keyword_rules[].name | string | Yes | Display name for this keyword rule | — | — |
| ruleset.keyword_rules[].keywords | array | Yes | Keyword set to match | — | min items 1 |
| ruleset.keyword_rules[].keywords[] | string | Yes | — | — | — |
| ruleset.keyword_rules[].case_sensitive | boolean | No | Whether keyword matching is case-sensitive | false | — |
| ruleset.keyword_rules[].severity | enum | No | Severity level of finding. Allowed values: critical, high, medium, low, info | — | — |
| classifier | object | No | — | — | no extra properties |
| classifier.labels | array | No | — | [] | — |
| classifier.labels[] | object | No | — | — | no extra properties |
| classifier.labels[].id | string | Yes | — | — | — |
| classifier.labels[].name | string | Yes | — | — | — |
| classifier.labels[].description | string | No | — | — | — |
| classifier.zero_shot_model | string | No | — | MoritzLaurer/mDeBERTa-v3-base-mnli-xnli | — |
| classifier.hypothesis_template | string | No | — | This text contains {}. | — |
| classifier.training_examples | array | No | — | [] | — |
| classifier.training_examples[] | object | No | — | — | no extra properties |
| classifier.training_examples[].text | string | Yes | — | — | — |
| classifier.training_examples[].label | string | Yes | — | — | — |
| classifier.training_examples[].accepted | boolean | No | — | true | — |
| classifier.training_examples[].source | string | No | Origin of this example (editor/feedback/import) | editor | — |
| classifier.min_examples_per_label | integer | No | — | 8 | min 1 |
| classifier.setfit_model | string | No | — | sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | — |
| entity | object | No | — | — | no extra properties |
| entity.entity_labels | array | No | — | [] | — |
| entity.entity_labels[] | string | No | — | — | — |
| entity.entity_descriptions | object | No | Optional GLiNER2 schema descriptions keyed by entity label | {} | — |
| entity.model | string | No | — | fastino/gliner2-base-v1 | — |
| extractor | object | No | Optional structured extraction — runs when detector fires | — | no extra properties |
| extractor.enabled | boolean | No | — | true | — |
| extractor.fields | array | Yes | — | — | min items 1 |
| extractor.fields[] | object | Yes | One output field in the extraction schema | — | no extra properties |
| extractor.fields[].name | string | Yes | Output field name — becomes a key in extracted_data JSON | — | — |
| extractor.fields[].description | string | No | Human-readable hint for what this field captures | — | — |
| extractor.fields[].type | enum | No | Allowed values: string, number, boolean, list[string], list[number] | string | — |
| extractor.fields[].entity_label | string | No | GLiNER2 schema label used for extraction (ENTITY and CLASSIFIER methods) | — | — |
| extractor.fields[].regex_pattern | string | No | Regex with one named capture group (?P<value>...) for RULESET method | — | — |
| extractor.fields[].regex_flags | string | No | Regex flags: i=case-insensitive, m=multiline, s=dotall | i | — |
| extractor.fields[].aggregate | enum | No | How to aggregate multiple matches. Allowed values: first, last, list, join, count | list | — |
| extractor.fields[].join_separator | string | No | — | , | — |
| extractor.fields[].min_confidence | number | No | Minimum GLiNER confidence for this field | 0.4 | min 0, max 1 |
| extractor.fields[].required | boolean | No | If true, skip saving extraction when this field is empty | false | — |
| extractor.gliner_model | string | No | — | fastino/gliner2-base-v1 | — |
| extractor.content_limit | integer | No | Number of characters of content to pass to the extractor (the classifier's matched_content is only 320 characters) | 4000 | min 320, max 8192 |
| pipeline_schema | object | No | — | — | — |
| max_findings | integer or null | No | Maximum number of findings to return per asset | null | — |
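
The `extractor.fields[]` rows describe a regex with one named capture group `(?P<value>...)` and an `aggregate` mode for combining multiple matches. The following is a minimal sketch of how those settings could fit together for the RULESET method. The function `extract_field` is hypothetical, and the exact semantics of each mode (for example, what is returned when nothing matches) are assumptions rather than documented behaviour. The defaults used here (`i` flags, `list` aggregation, `,` separator) are taken from the table as printed.

```python
import re

# Assumed mapping from flag letters to Python regex flags (i, m, s as in the table).
FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def extract_field(field, text):
    """Apply one extractor field definition to text and aggregate the matches.

    Hypothetical helper: `field` is a dict shaped like one extractor.fields[] entry.
    """
    flags = 0
    for letter in field.get("regex_flags", "i"):
        flags |= FLAG_MAP[letter]

    # Collect the named group "value" from every match, in document order.
    matches = [
        m.group("value")
        for m in re.finditer(field["regex_pattern"], text, flags)
    ]

    mode = field.get("aggregate", "list")
    if mode == "list":
        return matches
    if mode == "count":
        return len(matches)
    if mode == "join":
        return field.get("join_separator", ",").join(matches)
    if mode == "first":
        return matches[0] if matches else None  # None on no match is an assumption
    if mode == "last":
        return matches[-1] if matches else None
    raise ValueError(f"unknown aggregate mode: {mode}")


# Made-up field definition and sample text for illustration.
invoice_field = {
    "name": "invoice_ids",
    "regex_pattern": r"(?P<value>INV-\d+)",
}
sample = "Invoice INV-001 was replaced by inv-002."

as_list = extract_field(invoice_field, sample)
as_count = extract_field({**invoice_field, "aggregate": "count"}, sample)
as_joined = extract_field({**invoice_field, "aggregate": "join"}, sample)
```

With the default `i` flag the lowercase `inv-002` also matches, so the `list` result keeps both values in the order they appear.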