Monitor Configuration

A Monitor’s configuration has two parts: the runner (how the production output is judged) and the resolution (how the judge’s output maps to a verdict).

Runner types

Pick one of three runner types based on what you want the judge to do.

Model

Calls a chat model with a custom prompt. Useful when you want an LLM to look at the production input and output and emit a structured judgment (e.g. {"is_bad": true, "reason": "..."}).

Resource: any model your Company can call (gpt-4o-mini, your finetunes, etc.)
Input: the prompt template plus the JSON paths from the production payload to feed into the model (typically messages + response).

Workflow

Calls a Maitai Workflow with the production payload as input. Useful when judgment requires a multi-step process (retrieval, structured prompting, deterministic post-processing) — basically anywhere a Workflow already encodes the right logic.

Resource: a Workflow in your Company
Input: the payload mapping the workflow expects

Direct

Skips the judge entirely. The resolution is evaluated directly against the production output. Use this when the production system already emits the structured signal you want to alert on (e.g. a not_enough_information flag, a confidence score, a boolean classifier output) — no second model call needed.

Resource: none
Input: just the JSON path to the field that holds the production output (default: response)

Resolution

The resolution describes how to read the runner’s output and decide the verdict. It has three branches:

Error — boolean group of conditions; if it matches, the verdict is error.
Warning — same shape; if it matches (and error did not), the verdict is warning.
Default outcome — the verdict when neither error nor warning matches (always pass today).

Each branch is a tree of groups (combined with AND / OR, optionally negated) and conditions. A condition picks a JSON dot-path or text segment of the runner output, an operator, and an optional value.

Operators

Source	Operators
`text`	`contains`, `not_contains`, `equals`, `not_equals`, `regex_matches`, `is_empty`, `is_not_empty`
`json`	`exists`, `does_not_exist`, `is_true`, `is_false`, `equals`, `not_equals`, `contains`, `regex_matches`, `gt`, `gte`, `lt`, `lte`

JSON paths use dotted access with array indices: confidence.score, items.0.id, flags.is_bad.

Example: direct runner watching for a missing field

{
  "runner": {
    "resource_type": "direct",
    "input": {"response_path": "response"}
  },
  "result": {
    "error": {
      "id": "g_err",
      "kind": "group",
      "op": "OR",
      "children": [
        {
          "id": "c1",
          "kind": "condition",
          "source": "json",
          "path": "customer_id",
          "operator": "does_not_exist",
          "value": ""
        }
      ]
    },
    "warning": {"id": "g_warn", "kind": "group", "op": "OR", "children": []},
    "default_outcome": "pass"
  }
}

Designing against a real sample

The new-Monitor wizard fetches the most recent production payload from each attached target so you can:

See the actual JSON shape the runner will receive
Autocomplete path fields in the resolution editor against real keys
Preview the verdict against that sample before saving

If a target has no completed invocations yet, the wizard shows a “no sample yet” message instead of guessing.

Live preview

The detail page exposes a Preview action that runs the current draft (or any published version, or an ad-hoc payload) and returns the verdict + decorated trace without persisting a monitor_run. Use this to iterate on conditions before publishing.

Save vs publish

Saving the Monitor edits the working draft. Publishing (see Versions) freezes the current draft as an immutable snapshot that runs in production. Next: Targets.

Documentation Index

​Runner types

​Model

​Workflow

​Direct

​Resolution

​Operators

​Example: direct runner watching for a missing field

​Designing against a real sample

​Live preview

​Save vs publish