Dataset Export Guide
The dataset explorer and automation scripts share the same export pipeline: a FastAPI
endpoint that streams a .eval752.zip archive. This guide summarizes how to retrieve
packages, how filtering impacts the archive, and how the UI exposes the workflow.
Endpoint Summary
GET /datasets/{dataset_id}/export
Response: 200 OK with Content-Type: application/zip and Content-Disposition: attachment; filename=...eval752.zip.
The archive always contains:
- manifest.json: generation metadata, section list, and counts that honor filters.
- meta.json: optional description/schema metadata.
- sections/*.jsonl: one JSONL file per exported section.
- assets/...: decoded binary assets referenced by exported items.
- When run_id is provided: run_config.json, results.jsonl, and LightEval config (if available).
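
To check a downloaded package against the layout listed above, a small inspection helper can be useful. The following is a minimal sketch, not part of the export service: summarize_export is a hypothetical name, and it assumes manifest.json, run_config.json, and results.jsonl sit at the archive root. It makes no assumption about the keys inside manifest.json.

```python
import io
import json
import zipfile


def summarize_export(archive_bytes: bytes) -> dict:
    """Summarize the layout of an export archive held in memory.

    Hypothetical helper: assumes the top-level files live at the archive root.
    """
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        names = zf.namelist()
        # manifest.json is parsed but its schema is not interpreted here.
        manifest = json.loads(zf.read("manifest.json"))
    return {
        "manifest": manifest,
        # One JSONL file per exported section.
        "sections": sorted(
            n for n in names
            if n.startswith("sections/") and n.endswith(".jsonl")
        ),
        # Skip directory entries, keep only asset files.
        "assets": sorted(
            n for n in names
            if n.startswith("assets/") and not n.endswith("/")
        ),
        # Run artifacts are only present when run_id was passed.
        "has_run": "run_config.json" in names and "results.jsonl" in names,
    }
```

Comparing the "sections" list with the section list recorded in the manifest is one way to confirm that a filtered export contains what was requested.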
Section Filtering
- The backend accepts any number of section_ids parameters. IDs that do not belong to the dataset trigger 400 Bad Request to prevent accidental mismatches.
- Filters apply to both dataset content and run artifacts. results.jsonl only includes items from the exported sections, so downstream tools never see dangling references.
- When no section_ids are specified, all sections are exported.
Example:
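
A minimal sketch of how a client might build a filtered export URL, assuming section_ids is sent as a repeated query parameter and that IDs can be rendered as strings. The build_export_url helper and the localhost host in the usage note are illustrative assumptions, not part of the service.

```python
from urllib.parse import urlencode


def build_export_url(base_url, dataset_id, section_ids=(), run_id=None):
    """Build a GET /datasets/{dataset_id}/export URL (hypothetical helper)."""
    # Repeat the section_ids key once per requested section.
    params = [("section_ids", str(s)) for s in section_ids]
    if run_id is not None:
        params.append(("run_id", str(run_id)))
    url = f"{base_url.rstrip('/')}/datasets/{dataset_id}/export"
    # With no parameters, the plain URL requests every section.
    return f"{url}?{urlencode(params)}" if params else url
```

For example, build_export_url("http://localhost:8000", 42, [3, 7]) yields http://localhost:8000/datasets/42/export?section_ids=3&section_ids=7, while omitting section_ids produces the unfiltered URL.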
Run Selection
Passing run_id instructs the service to embed the matching run summary,
run_config.json, LightEval config (without API keys), and results.jsonl.
The run must belong to the dataset; otherwise the API returns 400 with a descriptive error.
Combine run_id with section_ids to focus on a subset of sections while keeping the run
metrics for those items only.
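
The validation and scoping rules described above can be sketched as plain functions. This is an illustration under stated assumptions, not the backend's actual code: ExportRequestError, resolve_export_scope, and filter_results are hypothetical names, and the section_id key on each result record is an assumed field.

```python
class ExportRequestError(Exception):
    """Hypothetical error carrying an HTTP status code and message."""

    def __init__(self, status_code, detail):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def resolve_export_scope(dataset_section_ids, dataset_run_ids,
                         section_ids=(), run_id=None):
    """Return the sections to export, or raise a 400-style error."""
    known = set(dataset_section_ids)
    unknown = [s for s in section_ids if s not in known]
    if unknown:
        # Section IDs outside the dataset are rejected.
        raise ExportRequestError(400, f"Unknown section_ids: {unknown}")
    if run_id is not None and run_id not in set(dataset_run_ids):
        # The run must belong to the dataset.
        raise ExportRequestError(400, f"Run {run_id} is not part of this dataset")
    # No filter means every section is exported.
    return list(section_ids) if section_ids else list(dataset_section_ids)


def filter_results(results, selected_section_ids):
    """Keep only results for exported sections (assumes a 'section_id' key)."""
    selected = set(selected_section_ids)
    return [r for r in results if r["section_id"] in selected]
```

Filtering results with the same section list that scopes the dataset content is what keeps results.jsonl free of references to sections missing from the archive.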
UI & UX Notes
- The dataset list still exposes a one-click “Export” button for full packages.
- The in-app Dataset Explorer now adds an “Export view” control that exports the filtered sections a user is reviewing. Microcopy on the button clarifies that search keywords are for previewing only, while section filters define the export scope. This reduces surprise for users who expect WYSIWYG downloads.
- Run detail panels continue to provide the most direct “export with run results” entry point.
Keep these cues aligned in future design work so that the mental model remains consistent: filters = structural subsets, runs = behavioral context.
