SynAE - Synthetic Benchmark Evaluator

Step 1 - Upload Original And Synthetic Benchmarks

SynAE Input Format

Data | Tool Calls | Output | Attribute1 | Attribute2 | …

All benchmark datasets (original and synthetic) must have the same columns.
The Data column is required, together with at least one of Tool Calls and Output (both may be included). Attribute columns are dataset-specific and optional; use them to label or categorize traces.
If an import format is selected below, uploaded trace files will be automatically converted to the SynAE format before evaluation.
Learn more about the SynAE input format and benchmark-specific configuration →
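
The column rules above can be checked locally before uploading. The following is a minimal sketch, not SynAE's own validation. The function names check_columns, read_header and same_columns are hypothetical, the default column names are taken from the header shown above, and the sketch assumes that "the same columns" means the same set of names regardless of order.

```python
import csv
import io


def read_header(csv_text):
    """Return the first CSV row as a list of column names (empty if no rows)."""
    return next(csv.reader(io.StringIO(csv_text)), [])


def check_columns(header, data_col="Data", tool_col="Tool Calls", output_col="Output"):
    """Return a list of problems with the header; an empty list means it passes."""
    problems = []
    # Data is always required.
    if data_col not in header:
        problems.append(f"missing required column: {data_col}")
    # At least one of Tool Calls / Output must accompany Data.
    if tool_col not in header and output_col not in header:
        problems.append(f"need at least one of: {tool_col}, {output_col}")
    return problems


def same_columns(headers):
    """True if every header has the same set of column names.

    Assumption: column order is ignored; only the names are compared.
    """
    return len({frozenset(h) for h in headers}) <= 1


if __name__ == "__main__":
    original = read_header("Data,Tool Calls,Output,Domain\nq1,[],a1,math\n")
    synthetic = read_header("Data,Tool Calls,Output,Domain\nq2,[],a2,code\n")
    print(check_columns(original))
    print(same_columns([original, synthetic]))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue with a file at once.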

Import format (optional)

Original benchmark dataset (CSV)

Synthetic benchmark datasets (CSV): add one, or several to compare

Add Synthetic


Benchmark-specific configuration

Select example config

Data column name (required)
Tool calls column name (optional)
Output column name (optional)
Attribute column names (optional, comma-separated)
List of tools available to the agent (optional, comma-separated)
Benchmark task description (optional, used for validity evaluation)
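
To make the configuration fields above concrete, here is one way they could be represented in code. This is only an illustrative sketch: BenchmarkConfig, its attribute names and split_csv_field are hypothetical and do not reflect SynAE's actual configuration schema. It shows how the comma-separated fields might be parsed and how the single required field might be enforced.

```python
from dataclasses import dataclass, field


def split_csv_field(text):
    """Split a comma-separated form value into trimmed, non-empty items."""
    return [item.strip() for item in text.split(",") if item.strip()]


@dataclass
class BenchmarkConfig:
    """Hypothetical container mirroring the form fields listed above."""

    data_column: str                                        # required
    tool_calls_column: str = ""                             # optional
    output_column: str = ""                                 # optional
    attribute_columns: list = field(default_factory=list)   # optional
    available_tools: list = field(default_factory=list)     # optional
    task_description: str = ""                              # optional, for validity evaluation

    def __post_init__(self):
        # Only the data column name is marked as required in the form.
        if not self.data_column.strip():
            raise ValueError("data column name is required")


if __name__ == "__main__":
    config = BenchmarkConfig(
        data_column="Data",
        output_column="Output",
        attribute_columns=split_csv_field("domain, difficulty"),
        available_tools=split_csv_field("search, calculator"),
    )
    print(config)
```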

Step 3 - Analyze SynAE Results

Upload your datasets in Step 1 and configure metrics in Step 2, then click Run evaluation to evaluate them through the SynAE backend. Evaluation uses only the first 100 rows of each dataset. Alternatively, load a previously computed results JSON.
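
Because only the first 100 rows of each dataset are evaluated, it can help to preview exactly which rows those are before uploading. The snippet below is a rough local sketch using Python's standard csv module; first_rows and ROW_LIMIT are hypothetical names, and it assumes the backend counts data rows after the header line.

```python
import csv
import io
from itertools import islice

ROW_LIMIT = 100  # from the text above: only the first 100 rows are evaluated


def first_rows(csv_text, limit=ROW_LIMIT):
    """Return up to `limit` data rows (header excluded) as dictionaries."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # islice stops reading after `limit` rows, so large files are not fully parsed.
    return list(islice(reader, limit))


if __name__ == "__main__":
    lines = ["Data,Output"] + [f"q{i},a{i}" for i in range(150)]
    rows = first_rows("\n".join(lines))
    print(len(rows), rows[0]["Data"], rows[-1]["Data"])
```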

Run evaluation

Evaluate your synthetic datasets against the original via the SynAE backend. Results appear automatically when done.

Load results

Upload a previously computed results JSON.

Results

Compare