Data Generation - Maitai

Using a Synthetic Conversation Tree (SCT), Maitai can “walk” the tree and generate thousands of unique, valid conversation paths automatically.

Why use Synthetic Data?

Bootstrap New Applications: Get a high-quality model before you have a single real user interaction.
Cover Edge Cases: Generate examples for rare or dangerous scenarios that you don’t want to wait for in production.
Balance your Data: If your production data is heavily skewed (e.g., 90% simple questions, 10% complex), you can use synthetic data to ensure your model is well-trained on complex cases.

Workflow: From Tree to Fine-tuning

Define the Tree: Map out the core logic of your intent or agent in the visualizer.
Generate Samples: Use Maitai to generate a set of conversations based on the tree logic.
Review & Refine: Inspect the generated conversations to ensure they meet your quality standards.
Create Dataset: Export the generated paths into a Fine-tune Dataset or a Test Set.
Fine-tune: Train your model on this robust, structured data.

Next

Create a dataset from traffic or synthetic data: Dataset Creation
Full end-to-end walkthrough: Fine-tune a Model

Visualizer Configuration

⌘I