Why use Synthetic Data?
- Bootstrap New Applications: Get a high-quality model before you have a single real user interaction.
- Cover Edge Cases: Generate examples for rare or dangerous scenarios that you don’t want to wait for in production.
- Balance your Data: If your production data is heavily skewed (e.g., 90% simple questions, 10% complex), you can use synthetic data to ensure your model is well-trained on complex cases.
Workflow: From Tree to Fine-tuning
- Define the Tree: Map out the core logic of your intent or agent in the visualizer.
- Generate Samples: Use Maitai to generate a set of conversations based on the tree logic.
- Review & Refine: Inspect the generated conversations to ensure they meet your quality standards.
- Create Dataset: Export the generated paths into a Fine-tune Dataset or a Test Set.
- Fine-tune: Train your model on this robust, structured data.
Next
- Create a dataset from traffic or synthetic data: Dataset Creation
- Full end-to-end walkthrough: Fine-tune a Model