Reasoning Augmentation enriches your fine-tuning dataset by adding structured chain-of-thought reasoning to each training example. An LLM analyzes every request in your dataset and extracts key information based on a schema you define, then prepends that reasoning to the assistant’s response during training. The result: your fine-tuned model learns to “think before it speaks,” producing more accurate and consistent outputs.

How It Works

  1. You define a reasoning schema — a set of fields with extraction instructions tailored to your use case.
  2. Maitai runs an LLM over every request in your dataset to extract values for each field.
  3. The extracted reasoning is stored as structured metadata on each dataset request.
  4. During fine-tuning, the reasoning is automatically prepended to each response and converted to model-specific tags (e.g., <think> for DeepSeek-style models).
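As a rough sketch, the extraction pass in step 2 amounts to running one instruction per schema field against each request. The function and parameter names below are illustrative, not Maitai's actual API; `llm_extract` stands in for whatever LLM call performs the extraction:

```python
# Illustrative sketch of the extraction step (not Maitai's real implementation).
# `llm_extract(request_text, instruction)` is a stand-in for an LLM call.
def extract_reasoning(request_text: str, schema: dict, llm_extract) -> dict:
    """Run the LLM once per schema field and return the structured metadata
    that gets stored on the dataset request."""
    return {
        field: llm_extract(request_text, instruction)
        for field, instruction in schema.items()
    }
```

The extracted key-value pairs are what you later see when inspecting the completed augmentation, and what gets prepended to responses during training.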

Starting a Reasoning Augmentation

Navigate to your dataset's Augmentations tab. You'll see both Reasoning Augmentation and Variation Augmentation buttons. Click Reasoning Augmentation to open the schema editor.

Defining a Reasoning Schema

The modal lets you define the fields you want the LLM to extract from each request in your dataset. Each field has two parts:
  • Field name — a short label for the extracted value (e.g., Intent, Sentiment, Key Entities).
  • Instruction — a natural-language prompt telling the LLM what to look for (e.g., “Determine what the user is asking for”).
Click Add Field to add more fields to your schema. You can remove any field with the trash icon. Once you've defined at least one valid field, click Start Augmentation to kick off the job.
Design your schema around the decision points of your task. Think about what intermediate facts or reasoning steps your model needs before producing its final answer. For example:
  • Customer support: Customer Intent, Urgency, Relevant Policy
  • Content moderation: Topic, Tone, Violation Type
  • Data extraction: Key Entities, Relationships, Confidence
  • Code generation: Requirements, Edge Cases, Approach
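For concreteness, here is what the customer-support schema above might look like expressed as data. This is a plain, illustrative structure, since the actual schema is defined in the UI modal rather than in code:

```python
# Illustrative reasoning schema for a customer-support dataset.
# Each entry pairs a field name with its extraction instruction,
# mirroring the two parts of a field in the schema editor.
customer_support_schema = [
    {"field": "Customer Intent",
     "instruction": "Summarize what the customer is trying to accomplish"},
    {"field": "Urgency",
     "instruction": "Rate urgency as low, medium, or high"},
    {"field": "Relevant Policy",
     "instruction": "Name any company policy that applies to this request"},
]
```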

Monitoring Progress

After starting the augmentation, a new entry appears in the Augmentations table. You can track its status as it progresses from CREATED → RUNNING → COMPLETED. Once completed, click View to inspect the extracted reasoning for each request. Each entry shows the key-value pairs the LLM extracted based on your schema.
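If you monitor the job programmatically rather than through the UI, the status progression lends itself to a simple polling loop. The helper below is a generic sketch, not part of Maitai's SDK; `get_status` stands in for however you query job status:

```python
import time

# Hypothetical polling helper; `get_status` is a stand-in callable that
# returns the current augmentation status string.
STATES = {"CREATED", "RUNNING", "COMPLETED"}

def wait_for_completion(get_status, interval_s=5.0, timeout_s=600.0) -> str:
    """Poll until the augmentation reaches COMPLETED or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status == "COMPLETED":
            return status
        if status not in STATES:
            raise RuntimeError(f"unexpected status: {status}")
        time.sleep(interval_s)
    raise TimeoutError("augmentation did not complete in time")
```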

Using Reasoning in Fine-tuning

When you include a reasoning-augmented dataset in a Composition and start a Fine-tune Run, the reasoning is automatically handled:
  • The structured reasoning is prepended to each assistant response in the training data.
  • The reasoning tags are automatically converted to the appropriate model-specific format for your chosen base model (e.g., <think> for DeepSeek-style models).
No extra configuration is needed — once reasoning is attached to your dataset, it flows through the entire fine-tuning pipeline.
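Conceptually, this pipeline step boils down to wrapping the reasoning in the base model's tag format and prepending it to the response. The `<think>` tags for DeepSeek-style models come from the docs above; the fallback tag and the function itself are assumptions for illustration, not Maitai's actual implementation:

```python
# Sketch of the prepend-and-tag step (illustrative only).
# DeepSeek-style models use <think> tags per the docs above; the generic
# fallback tag here is an assumption for other model families.
TAG_FORMATS = {
    "deepseek": ("<think>", "</think>"),
}

def wrap_reasoning(reasoning: str, response: str, model_family: str) -> str:
    """Prepend tagged reasoning to an assistant response for training."""
    open_tag, close_tag = TAG_FORMATS.get(
        model_family, ("<reasoning>", "</reasoning>")
    )
    return f"{open_tag}\n{reasoning}\n{close_tag}\n{response}"
```

Because this conversion happens automatically per base model, the same augmented dataset can be reused across Compositions targeting different model families.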
