1) Create a Dataset
The first step is gathering the data you want the model to learn from.

- Go to Forge > Finetuning.
- Click New Dataset.
- Choose a source (e.g., Production Traffic to filter real requests, or File Upload).
- Save the dataset.
2) Prepare & Refine Data
Once the dataset is created, you can improve its quality.

- Open the Dataset.
- Use AI Review to spot inconsistencies.
- Edit individual requests to fix errors or improve the “ideal” response.
- Apply Modifications (regex) or Augmentations (synthetic variations) if needed.
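To illustrate what a regex Modification does conceptually, the sketch below applies substitutions across a dataset's ideal responses. The field names and data are hypothetical, not Maitai's actual schema:

```python
import re

# Hypothetical dataset entries; field names are illustrative only.
requests = [
    {"prompt": "What is the ETA?", "ideal_response": "The  ETA is 5 min."},
    {"prompt": "Cancel my order", "ideal_response": "Order cancelled ."},
]

def apply_modification(entries, pattern, replacement):
    """Apply a regex substitution to every ideal response in place."""
    for entry in entries:
        entry["ideal_response"] = re.sub(pattern, replacement, entry["ideal_response"])
    return entries

# Collapse runs of whitespace, then remove stray spaces before punctuation.
apply_modification(requests, r"\s{2,}", " ")
apply_modification(requests, r"\s+([.,!?])", r"\1")
```

Augmentations work in the opposite direction: instead of normalizing existing requests, they generate synthetic variations (paraphrases, typos, reorderings) to broaden coverage.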
3) Create a Composition
A Composition is a “recipe” that combines one or more datasets. This lets you mix and match data (e.g., “80% production data + 20% golden test set”) without duplicating the underlying requests.

- Go to Forge > Compositions.
- Click New Composition.
- Select the datasets to include.
- Adjust sampling weights if desired.
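The sampling weights control how often each dataset contributes an example to the training mix. A minimal sketch of weighted sampling over hypothetical in-memory datasets (an illustration of the idea, not Maitai's implementation):

```python
import random

random.seed(0)  # deterministic for the example

# Hypothetical datasets standing in for a Composition's sources.
production_data = [{"source": "production", "id": i} for i in range(1000)]
golden_data = [{"source": "golden", "id": i} for i in range(50)]

def sample_composition(datasets, weights, n):
    """Draw n examples, picking a dataset for each draw by its weight."""
    picks = random.choices(range(len(datasets)), weights=weights, k=n)
    return [random.choice(datasets[i]) for i in picks]

# An "80% production + 20% golden" mix of 500 examples.
mix = sample_composition([production_data, golden_data], weights=[0.8, 0.2], n=500)
production_share = sum(e["source"] == "production" for e in mix) / len(mix)
```

Because sampling happens at composition time, the small golden set can be upweighted without copying its rows into the larger dataset.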
4) Start a Fine-tune Run
Now you’re ready to train.

- Go to Forge > Finetuning (or start directly from a Composition).
- Click New Finetune Run.
- Select your Composition.
- Choose the Base Model (e.g., llama-3.1-8b-instruct).
- Configure hyperparameters (epochs, learning rate) or stick to defaults.
- Start the run.
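For orientation, a fine-tuning configuration typically boils down to a handful of values like the illustrative ones below. These numbers are common starting points, not Maitai's defaults; the actual options are whatever the New Finetune Run form exposes:

```python
# Illustrative hyperparameters only; check the run form for the real options.
hyperparameters = {
    "epochs": 3,            # more epochs fit the data harder, but risk overfitting
    "learning_rate": 2e-5,  # smaller values train more conservatively
    "batch_size": 8,        # examples per optimization step
}
```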
5) Validate the Model
After training completes, Maitai automatically runs a validation step (if configured), or you can run one manually.

- Open the Fine-tune Run.
- Go to the Validation tab.
- Review the Validation Run results to see how the new model performed against a holdout set or specific Test Set.
- Check metrics like accuracy and pass rate.
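Accuracy and pass rate are simple ratios over the validation examples: exact matches against the expected output, and examples that satisfied whatever pass criteria the run used. A sketch over hypothetical validation records:

```python
# Hypothetical validation records; the structure is illustrative only.
results = [
    {"expected": "refund_request", "predicted": "refund_request", "passed": True},
    {"expected": "order_status", "predicted": "order_status", "passed": True},
    {"expected": "cancel_order", "predicted": "order_status", "passed": False},
    {"expected": "refund_request", "predicted": "refund_request", "passed": True},
]

# Accuracy: fraction of predictions that exactly match the expected output.
accuracy = sum(r["expected"] == r["predicted"] for r in results) / len(results)

# Pass rate: fraction of examples that met the run's pass criteria.
pass_rate = sum(r["passed"] for r in results) / len(results)
```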
6) Benchmarking against Golden Test Sets
Once you have a candidate model, the final hurdle before deployment is ensuring it actually outperforms your baseline on high-signal cases.

- Go to Test > Test Sets.
- Select your Golden Test Set for this intent.
- Start a New Test Run using your newly fine-tuned model.
- Use the Compare Runs feature to view the new model’s results side by side with your current production model’s.
- Verify that the new model maintains or improves performance on critical “golden” examples.
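The core question Compare Runs answers can be phrased as a regression check: did any golden example that passed under the production model fail under the candidate? A sketch with hypothetical per-example pass/fail results from two test runs:

```python
# Hypothetical golden-example results keyed by example ID.
baseline = {"g1": True, "g2": True, "g3": False, "g4": True}   # production model
candidate = {"g1": True, "g2": True, "g3": True, "g4": True}   # fine-tuned model

# Regressions: passed before, fails now. Improvements: the reverse.
regressions = [k for k in baseline if baseline[k] and not candidate[k]]
improvements = [k for k in baseline if not baseline[k] and candidate[k]]

# A conservative gate: deploy only if no golden example regressed.
safe_to_deploy = not regressions
```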
7) Deploy
If the model performs well and passes your golden benchmarks:

- Click Deploy on the Fine-tune Run page.
- The model becomes available as an ft:... ID that you can use in your Applications or Intents configuration.
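Once deployed, the model is referenced by its ID wherever a model name is accepted. As a minimal sketch, assuming an OpenAI-style chat completions interface and a placeholder ID (substitute the actual ft:... ID shown on your Fine-tune Run page), the request body would look like:

```python
import json

# Placeholder only; use the real ft:... ID from the Fine-tune Run page.
MODEL_ID = "ft:example-model-id"

payload = {
    "model": MODEL_ID,
    "messages": [{"role": "user", "content": "Where is my order?"}],
}

# This body can be POSTed to any OpenAI-compatible /chat/completions endpoint.
body = json.dumps(payload)
```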