Dataset Format
eval_752 stores evaluation items in JSONL (one JSON object per line). This page describes the format so you can understand what's in an import, build datasets by hand, or debug column mappings.
If you're building datasets through the UI (Dataset Builder or Hugging Face import), eval_752 handles the format for you. This page is a reference for when you want to inspect or manually construct items.
Item Structure
Each line in the JSONL file represents one evaluation item:
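A minimal sketch of what one line might look like. The `choices` and `answer` fields appear elsewhere on this page; the `id`, `type`, and `prompt` field names here are illustrative assumptions, not confirmed parts of the format:

```json
{"id": "item-001", "type": "mcq_single", "prompt": "Which planet is largest?", "choices": ["Earth", "Jupiter", "Mars"], "answer": "Jupiter"}
```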
Field Reference
Supported Item Types
- `mcq_single` — One correct answer from `choices`. `answer` is a string matching one choice.
- `mcq_multi` — Multiple correct answers. `answer` is an array of strings.
- `freeform` — Open-ended text response. `answer` is the reference text (used by judge or regex scoring).
- `code` — Code generation task. `answer` contains the reference solution.
- `judge_pairwise` — Two-response comparison for arena mode.
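The per-type `answer` shapes above can be checked with a small validator. This is a sketch under assumptions: the `type` field name and the exact parsed structure are hypothetical, while the `choices`/`answer` semantics follow the list above.

```python
def validate_item(item: dict) -> bool:
    """Check that an item's answer field matches its declared type.

    Assumes a hypothetical "type" field naming the item type.
    """
    t = item.get("type")
    answer = item.get("answer")
    if t == "mcq_single":
        # answer is a string matching one of the choices
        return isinstance(answer, str) and answer in item.get("choices", [])
    if t == "mcq_multi":
        # answer is an array of strings, each matching a choice
        choices = item.get("choices", [])
        return (isinstance(answer, list)
                and all(isinstance(a, str) and a in choices for a in answer))
    if t in ("freeform", "code"):
        # answer is reference text / a reference solution
        return isinstance(answer, str)
    if t == "judge_pairwise":
        # arena-mode items are scored by comparison, not a single reference
        return True
    return False
```

Running such a check over every line of a hand-built JSONL file catches shape mismatches before an import.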
Multi-Modal Items
For items that include images, reference them with relative paths under assets/:
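A sketch of such an item, assuming a hypothetical `images` field (the `assets/` path convention is from this page; the field name itself is an assumption):

```json
{"id": "item-042", "type": "mcq_single", "prompt": "What does this chart show?", "images": ["assets/chart_042.png"], "choices": ["Revenue", "Latency"], "answer": "Latency"}
```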
When packaged in an .eval752.zip, image files live under the assets/ directory. During a run, eval_752 sends these images to the model if the provider supports vision inputs.
The .eval752.zip Package
An .eval752.zip bundle is a self-contained dataset (or result) package:
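A possible layout, sketched under assumptions: this page confirms only that image files live under `assets/`; the items file name shown here is hypothetical:

```
my_dataset.eval752.zip
├── dataset.jsonl       # items, one JSON object per line (hypothetical name)
└── assets/
    └── chart_042.png   # image files referenced by items
```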
This format is used for:
- Sharing datasets between eval_752 instances
- Exporting and archiving run results
- Importing Browser Harness captures
For import workflows, see Working with Datasets.
