Custom

Schema-driven detector documentation.

CUSTOM | active | P0 | 61 params | 19 examples

Detector Metadata

Capability catalog entry from all_detectors.json.

Categories

CLASSIFICATION, COMPLIANCE

Supported Asset Types

TXT, TABLE, URL, IMAGE

Recommended Model

mDeBERTa-v3 + SetFit + GLiNER + HuggingFace transformers

Notes

User-defined rules and pipelines tailored to specific business needs. Supports regex, GLiNER2, LLM, text classification, image classification, feature extraction, and object detection pipelines.
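
As a rough illustration of a user-defined rule, the sketch below builds one hypothetical RULESET configuration. The field names follow the Parameters table in this document; every value (key, names, pattern, keywords) is invented for the example. Serializing the configuration as JSON and mapping the flag letters i, m, s onto Python's `re` flags are both assumptions, not confirmed behavior of the detector service.

```python
import json
import re

# Hypothetical RULESET configuration. Field names come from the Parameters
# table; all values are made up for illustration.
config = {
    "custom_detector_key": "internal-project-codes",
    "name": "Internal project codes",
    "method": "RULESET",
    "languages": ["de", "en"],
    "ruleset": {
        "regex_rules": [
            {
                "id": "project-code",
                "name": "Project code",
                "pattern": r"\bPRJ-\d{4}\b",
                "flags": "i",
                "severity": "medium",
            }
        ],
        "keyword_rules": [
            {
                "id": "confidential-terms",
                "name": "Confidential terms",
                "keywords": ["confidential", "vertraulich"],  # min items 1
                "case_sensitive": False,
                "severity": "high",
            }
        ],
    },
    "max_findings": None,
}

# Assumption: the flag letters map onto Python's re flags like this.
FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def compile_rule(rule):
    """Compile one regex rule locally to catch pattern errors early."""
    flags = 0
    for letter in rule.get("flags", ""):
        flags |= FLAG_MAP[letter]
    return re.compile(rule["pattern"], flags)


compiled = compile_rule(config["ruleset"]["regex_rules"][0])
payload = json.dumps(config, indent=2)  # assumed wire format
```

Compiling the pattern locally before submitting a rule is a cheap way to catch syntax errors. It does not guarantee that the detector's own regex engine treats the pattern identically.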

Parameters

Configuration parameters for the Custom detector. Shared from `CustomDetectorConfig`.

Parameter	Type	Required	Description	Default	Constraints
custom_detector_key	string	Yes	Stable key used to identify one custom detector instance	—	—
name	string	Yes	User-facing name of custom detector	—	—
description	string	No	—	—	—
method	enum	No	Execution method for custom detector logic. Allowed values: RULESET, CLASSIFIER, ENTITY, PIPELINE	—	—
languages	array	No	—	["de","en"]	—
languages[]	string	No	—	—	—
ruleset	object	No	—	—	no extra properties
ruleset.regex_rules	array	No	—	[]	—
ruleset.regex_rules[]	object	No	—	—	no extra properties
ruleset.regex_rules[].id	string	Yes	Stable ID for this regex rule	—	—
ruleset.regex_rules[].name	string	Yes	Display name for this regex rule	—	—
ruleset.regex_rules[].pattern	string	Yes	Regular expression pattern	—	—
ruleset.regex_rules[].flags	string	No	Regex flags (for example i, m, s)		—
ruleset.regex_rules[].severity	enum	No	Severity level of the finding. Allowed values: critical, high, medium, low, info	—	—
ruleset.keyword_rules	array	No	—	[]	—
ruleset.keyword_rules[]	object	No	—	—	no extra properties
ruleset.keyword_rules[].id	string	Yes	Stable ID for this keyword rule	—	—
ruleset.keyword_rules[].name	string	Yes	Display name for this keyword rule	—	—
ruleset.keyword_rules[].keywords	array	Yes	Keyword set to match	—	min items 1
ruleset.keyword_rules[].keywords[]	string	Yes	—	—	—
ruleset.keyword_rules[].case_sensitive	boolean	No	Whether keyword matching is case-sensitive	false	—
ruleset.keyword_rules[].severity	enum	No	Severity level of the finding. Allowed values: critical, high, medium, low, info	—	—
classifier	object	No	—	—	no extra properties
classifier.labels	array	No	—	[]	—
classifier.labels[]	object	No	—	—	no extra properties
classifier.labels[].id	string	Yes	—	—	—
classifier.labels[].name	string	Yes	—	—	—
classifier.labels[].description	string	No	—	—	—
classifier.zero_shot_model	string	No	—	MoritzLaurer/mDeBERTa-v3-base-mnli-xnli	—
classifier.hypothesis_template	string	No	—	This text contains {}.	—
classifier.training_examples	array	No	—	[]	—
classifier.training_examples[]	object	No	—	—	no extra properties
classifier.training_examples[].text	string	Yes	—	—	—
classifier.training_examples[].label	string	Yes	—	—	—
classifier.training_examples[].accepted	boolean	No	—	true	—
classifier.training_examples[].source	string	No	Origin of this example (editor/feedback/import)	editor	—
classifier.min_examples_per_label	integer	No	—	8	min 1
classifier.setfit_model	string	No	—	sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2	—
entity	object	No	—	—	no extra properties
entity.entity_labels	array	No	—	[]	—
entity.entity_labels[]	string	No	—	—	—
entity.entity_descriptions	object	No	Optional GLiNER2 schema descriptions keyed by entity label	{}	—
entity.model	string	No	—	fastino/gliner2-base-v1	—
extractor	object	No	Optional structured extraction — runs when detector fires	—	no extra properties
extractor.enabled	boolean	No	—	true	—
extractor.fields	array	Yes	—	—	min items 1
extractor.fields[]	object	Yes	One output field in the extraction schema	—	no extra properties
extractor.fields[].name	string	Yes	Output field name — becomes a key in extracted_data JSON	—	—
extractor.fields[].description	string	No	Human-readable hint for what this field captures	—	—
extractor.fields[].type	enum	No	Allowed values: string, number, boolean, list[string], list[number]	string	—
extractor.fields[].entity_label	string	No	GLiNER2 schema label used for extraction (ENTITY and CLASSIFIER methods)	—	—
extractor.fields[].regex_pattern	string	No	Regex with one named capture group (?P<value>...) for RULESET method	—	—
extractor.fields[].regex_flags	string	No	Regex flags: i=case-insensitive, m=multiline, s=dotall	i	—
extractor.fields[].aggregate	enum	No	How to aggregate multiple matches. Allowed values: first, last, list, join, count	list	—
extractor.fields[].join_separator	string	No	—	,	—
extractor.fields[].min_confidence	number	No	Minimum GLiNER confidence for this field	0.4	min 0, max 1
extractor.fields[].required	boolean	No	If true, skip saving extraction when this field is empty	false	—
extractor.gliner_model	string	No	—	fastino/gliner2-base-v1	—
extractor.content_limit	integer	No	Number of characters of content to pass to the extractor (classifier matched_content is only 320 characters)	4000	min 320, max 8192
pipeline_schema	object	No	—	—	—
max_findings	integer | null	No	Maximum number of findings to return per asset	null	—
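
The `extractor.fields[].regex_pattern`, `regex_flags`, and `aggregate` rows above describe a small extraction step: each match contributes its `value` named group, and the matches are then combined. The sketch below is one plausible reading of those rows, not the detector's actual implementation. `extract_field`, `FLAG_MAP`, the sample text, and the sample pattern are hypothetical, and returning `None` for `first` or `last` when nothing matches is an assumption.

```python
import re

# Assumed mapping from the documented flag letters to Python's re flags.
FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def extract_field(text, field):
    """Hypothetical helper: run one regex extractor field and aggregate matches."""
    flags = 0
    for letter in field.get("regex_flags", "i"):  # table default: i
        flags |= FLAG_MAP[letter]
    matches = re.finditer(field["regex_pattern"], text, flags)
    values = [m.group("value") for m in matches]  # the named group (?P<value>...)

    mode = field.get("aggregate", "list")  # table default: list
    if mode == "list":
        return values
    if mode == "count":
        return len(values)
    if mode == "join":
        # Separator default as displayed in the table.
        return field.get("join_separator", ",").join(values)
    if mode in ("first", "last"):
        if not values:
            return None  # assumption: an empty result is reported as None
        return values[0] if mode == "first" else values[-1]
    raise ValueError(f"unsupported aggregate: {mode!r}")


# Invented sample input and field definition.
text = "Invoice INV-1001 replaces inv-0999."
field = {
    "name": "invoice_numbers",
    "regex_pattern": r"INV-(?P<value>\d{4})",
    "aggregate": "list",
}
numbers = extract_field(text, field)
```

Because `regex_flags` defaults to `i`, the lowercase `inv-0999` in the sample text also matches. A field that needs case-sensitive matching would have to set `regex_flags` to an empty string or to other flags explicitly.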